Journal of Clinical Microbiology. 2006 Dec 13;45(2):522–528. doi: 10.1128/JCM.02136-06

A Cautionary Tale: Lack of Consistency in Allele Sizes between Two Laboratories for a Published Multilocus Microsatellite Typing System

Alessandro C Pasqualotto 1, David W Denning 1,*, Michael J Anderson 1
PMCID: PMC1829014  PMID: 17166958

Abstract

For species with low genetic diversity, typing using the differences in PCR fragment length resulting from variations in numbers of short tandem repeats has been shown to provide a high level of discrimination. This technique has been called multilocus microsatellite typing (MLMT) or multiple-locus variable-number tandem repeat analysis, and studies usually employ genetic or sequence analyzers to size PCR fragments to a high degree of precision. We set out to validate one such system that has been developed for Aspergillus fumigatus (H. A. de Valk, J. F. G. M. Meis, I. M. Curfs, K. Muehlethaler, J. W. Mouton, and C. H. W. Klaassen, J. Clin. Microbiol. 43:4112-4120, 2005). Allele sizes obtained by sequencing were compared with those reported by two genotyping laboratories that used capillary electrophoresis (CE) for sizing. Size differences of up to 6 bases were found between the actual sizes determined by sequencing and the sizes reported by CE. In addition, because the two genotyping laboratories used different machines and running conditions, differences of up to 3 bases were identified between them. As the microsatellite markers used differ by repeat units of 3 or 4 bases, it was not possible to assign PCR fragments to the correct alleles without confirming the sizes of a range of alleles by direct sequencing. Lines of best fit were plotted for each CE machine against actual sizes and will enable unsequenced PCR fragments to be assigned to the correct alleles. This study highlights the care required to ensure that an MLMT system undergoes a suitable correction procedure before data can be merged between different laboratories involved in the typing of individual species.


Multilocus sequence typing (MLST) is well established as the gold standard in microbiological research for epidemiological and population genetic studies and involves the detection of point mutations in sequenced PCR fragments (29). However, for organisms with low levels of genetic diversity, such as Yersinia pestis (24), Leishmania infantum (31), and Aspergillus fumigatus (3), markers that have higher mutation rates are required. Microsatellites are genomic sequences consisting of tandemly repeated short motifs of 2 to 6 nucleotides. Mutations occur mainly by replication slippage and result in changes in the numbers of repeat units. These mutations occur at rates of 10⁻² to 10⁻⁶ per generation, which compare with 10⁻⁹ for point mutations.

PCR primers are designed for the unique region on each side of the microsatellite, and as markers differ by multiples of the repeat unit, the length of the PCR fragment is used to assign an allele. One of the primers is labeled with a fluorescent tag, and the PCR product is run under denaturing conditions, using either slab-based polyacrylamide gel electrophoresis (SE) or capillary-based polyacrylamide-derived gel electrophoresis (CE). The mobility of the labeled single-stranded DNA molecule is dependent on both its length and its sequence. Commercially available machines are sold as sequence or genetic analyzers, and there has been a switch within the last decade from SE to CE machines. Up to four or five fluorescent labels can be detected by these machines, so the standard approach is to analyze three differently labeled markers with a commercially available size standard in one capillary. An appropriate calibration method can then be applied to assign sizes to the fragments of interest.

The use of microsatellite markers and this technology has been thoroughly tested in the genotyping, forensic, and molecular ecology fields. Their limitations and sources of error have been detailed in various methodological papers. In contrast, there seems to have been less discussion of these potential problems within the microbial typing field. Although many studies have been published for viruses, bacteria, fungi, and parasites, the only problems that are commonly mentioned are those associated with the PCR itself; these include the addition of an adenylate to the 3′ end of the DNA molecule by Taq DNA polymerase (terminal transferase activity) (4, 7, 11, 19, 23, 26, 39) and molecules that are one, two, or three repeat units shorter as a consequence of replication slippage during strand synthesis (1, 4, 11, 19, 22, 26, 34). This slippage is usually more of a problem with longer repeats consisting of dinucleotide units and is detectable as “stutter bands” on electrophoretic traces. As a consequence, this form of typing, variously called multilocus microsatellite typing (MLMT) (14), variable-number tandem repeat typing (VNTR), or multiple-locus variable-number tandem repeat analysis (37), is sold as highly discriminatory and relatively easy and rapid as well as reproducible. This technology has been described as readily portable since alleles can be assigned either on the basis of the total length of the PCR fragment or on the number of repeat units. It should, therefore, be possible to set up international databases where the alleles and genotypes for isolates from many laboratories can be entered and compared.

Two microsatellite typing systems have been published for the filamentous fungal opportunistic pathogen Aspergillus fumigatus (4, 11), and they have been proposed as suitable for large-scale epidemiological studies. The earlier scheme has been used for many years to type isolates in our laboratory (4). Because we did not usually have access to a properly set-up machine, isolates were assigned allele sizes by direct sequencing, and the sizes that we obtained were concordant with those described in this scheme. The publication of the de Valk et al. study in 2005 (11) prompted us to validate it as a suitable typing system. We did this by determining the sizes of alleles by direct sequencing and by sizing single strands relative to internal size markers in two laboratories offering genotyping services. This study showed that (i) the size obtained from a CE machine does not always correspond to the size obtained by sequencing and that (ii) the use of different machines can result in the assignment of different sizes for the same allele. We believe that knowledge of these technical issues will assist other laboratories whose members are thinking of setting up microsatellite typing systems for their microorganisms of interest.

(Part of this work was presented at the 2nd Trends in Medical Mycology meeting, 2005 [32a].)

MATERIALS AND METHODS

Isolates.

Eleven clinical isolates of A. fumigatus were typed from our culture collection.

DNA extraction.

Mycelium from overnight liquid cultures was used for DNA extraction. Two hundred milligrams (wet weight) was placed in a Lysing Matrix A tube (FastDNA Kit; Q-BIOgene, Cambridge, United Kingdom). Buffer AP1 was added (800 μl; DNeasy Plant mini kit; QIAGEN, Crawley, United Kingdom), and the cells were disrupted using a FastPrep FP120 homogenizer (Q-BIOgene, Cambridge, United Kingdom) twice at speed 5.0 for 45 seconds. RNase A was added (8 μl), and from that point, the DNeasy Plant Mini Handbook (January 2004 revision) was followed, except that double the volume of buffer AP2 was used (260 μl).

PCR amplification and analysis.

Reaction conditions and primer pairs used for microsatellite analysis were as described by de Valk et al. (11), except as described below. According to the genomic sequence of the published strain (30), the sequence of the reverse primer for STRAf 4C is incorrect and should be TCCAACCCATCCAATTCGTAA. Single rather than multiplex PCRs were set up, since multiplex reactions did not work as consistently and it was easier to check for the presence of single PCR products on agarose gels before analyzing them on a CE machine. Each primer was used at 1 μM, with 10 ng of genomic DNA per 25-μl reaction. The forward STRAf 3C and STRAf 4C primers were labeled with NED (2,7′,8′-benzo-5′-fluoro-2′,4,7-trichloro-5-carboxyfluorescein) instead of TET (6-carboxy-4,7,2′,7′-tetrachlorofluorescein) in order to minimize interference with the other labels. PCR amplifications were performed with 1 U of FastStart Taq DNA polymerase (Roche Diagnostics, Burgess Hill, United Kingdom) in an iCycler thermal cycler (Bio-Rad, Hemel Hempstead, United Kingdom). Products were pooled according to their loci as M3 (STRAf 3A, 3B, and 3C) or M4 (STRAf 4A, 4B, and 4C) and sent for analysis by capillary electrophoresis. University service facilities were used in the Biological Sciences department at The University of Warwick (Coventry, United Kingdom) (hereinafter referred to as “Warwick”) and at the Advanced Biotechnology Centre, Imperial College (London, United Kingdom) (hereinafter referred to as “Imperial”). The CE machines and running conditions used in each laboratory are shown in Table 1. Depending on the signal intensity, differing dilutions were used, ranging from neat to 1:150.

TABLE 1.

Running conditions used in the two genotyping laboratories

| Laboratory | Machine | Length of capillaries (cm) | Run voltage (kV) | Injection voltage (kV) | Injection time (s) | Temp (°C) | Polymer | Size standard | Calibration method |
|---|---|---|---|---|---|---|---|---|---|
| Warwick | Applied Biosystems 3130 genetic analyzer | 36 | 15 | 1.2 | 23 | 60 | POP-7 | GeneScan 500 LIZ | Local Southern |
| Imperial | ABI PRISM 310 genetic analyzer | 47 | 15 | 15 | 5 | 60 | POP-4 | GeneScan 500 ROX | Local Southern |

DNA sequence analysis.

The six microsatellite loci were sequenced for all 11 isolates used in this study. PCRs were run as described above, except with nonfluorescent primers. BigDye Terminator ready reaction mixture (version 3.1; Applied Biosystems, Warrington, United Kingdom) was used for cyclic sequencing. Final products were ethanol precipitated and analyzed in an ABI PRISM 3100 genetic analyzer.

Data analysis.

Size differences were calculated as the differences between the actual sizes (obtained by sequencing) and the mean allele sizes obtained from each CE machine; the differences between the sizes obtained from the three CE machines were calculated in the same way. For comparison of allele sizes obtained from sequencing and capillary electrophoresis, the correlation and the linear regression equation were calculated for each marker using Microsoft Excel.
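The calculation described above can be illustrated with a short Python sketch. It is not the Excel workbook used in the study; the function name `fit_line` is hypothetical, and the input values are the STRAf 4A sequenced sizes and Warwick CE sizes listed in Table 2. The sketch computes the per-allele size differences and an ordinary least-squares line of called size against actual size.

```python
def fit_line(actual, called):
    """Ordinary least-squares fit of called size (y) on actual size (x).

    Returns (slope, intercept, r_squared).
    """
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(called) / n
    sxx = sum((x - mean_x) ** 2 for x in actual)
    syy = sum((y - mean_y) ** 2 for y in called)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, called))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# STRAf 4A values copied from Table 2 (sequenced size, Warwick CE size).
sequenced_4a = [178, 186, 190, 194, 202, 206, 209, 213, 222]
warwick_4a = [180.2, 188.3, 192.2, 196.4, 204.3, 208.2, 211.4, 215.2, 224.7]

slope, intercept, r_squared = fit_line(sequenced_4a, warwick_4a)

# Size difference per allele (CE size minus sequenced size), as in Table 2.
differences = [round(c - a, 1) for a, c in zip(sequenced_4a, warwick_4a)]

print(f"slope={slope:.3f} intercept={intercept:.2f} r2={r_squared:.5f}")
print("differences:", differences)
```

The same function could be applied to each marker and machine in turn; only one marker is shown here to keep the example short.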

RESULTS

The de Valk et al. study (11) involved the development of nine microsatellite markers for typing A. fumigatus: three based on dinucleotide repeat units, three on trinucleotide repeat units, and three on tetranucleotide repeat units. Each unlabeled primer has been designed to ensure that Taq DNA polymerase terminal transferase activity results in the addition of a 3′ adenylate to the labeled DNA strand. This removes one source of error in assigning the correct sizes to alleles. As also mentioned in that study (11), stutter bands are a problem especially with dinucleotide repeat units but also to a lesser extent with trinucleotide repeat units. Because a sufficiently high level of discrimination is possible with the use of the tri- and tetranucleotide repeat unit markers, the problem of assessing stutter bands was minimized in our study by avoiding the dinucleotide repeat unit markers as recommended by de Valk et al. (11). They proposed the use of CE machines and stated that “the assay combines high reproducibility with the easy exchange of results” (11).

On this basis, we set out to validate this typing panel in our laboratory. Eleven isolates of A. fumigatus were used, and initially, every allele for all six markers was sequenced. This generated a total of 42 sequenced alleles and resulted in the first source of conflict, as the sizes we obtained for an allele containing a specific number of repeat units differed by up to 5 bases from the sizes reported by de Valk et al. (11) (Table 2). The sizes reported by CE were both smaller and larger than the sequenced sizes; for instance, a STRAf 3B allele containing 35 repeat units was 5.0 bases shorter by CE analysis (240 versus 235.0), whereas the STRAf 3A allele with 46 repeat units was 4.8 bases longer (245 versus 249.8). One of the isolates that we analyzed was Af293, a strain whose genome has been sequenced (30), and although there is agreement on the number of repeat units, the sizes that we obtained from both direct sequencing and the published genome sequence differ by up to 4.8 bases from the sizes recorded by de Valk et al. (11). Interestingly, although the STRAf 3B allele for Af293 contains 20 repeat units, it is actually 3 bases longer than expected, as it contains an additional 3 bases just 5′ of the repeat region, which are not present in other isolates. If this extra sequence were to be found in the general population, then this marker would be less than ideal for use in population genetic studies, as it would not be possible on the basis of size alone to know the actual sequence of each allele. The other markers should be well suited for any population genetics study, especially as they are all on separate chromosomes (3A, 3C, 4A, 4B, and 4C are on chromosomes 4, 3, 6, 7, and 8, respectively). It is also of particular interest to note that marker STRAf 3A is located within the coding region of a gene (the Afu4g09070 gene), with the resulting expansion and contraction of a run of glutamate residues in the encoded protein, and that this marker, along with STRAf 3C, is by far the most polymorphic (11).

TABLE 2.

Comparison between fragment sizes (in numbers of bases) revealed by DNA sequencing and capillary electrophoresis

Columns headed "CE size" give the size obtained by capillary electrophoresis from the indicated source, with the difference from the size obtained by sequencing in parentheses. Columns headed "Difference" give the difference between the sizes obtained from the two indicated sources.

| Allele | Size obtained by sequencingᵃ | No. of repeats | CE size, de Valkᵉ | CE size, Warwick | CE size, Imperial | Difference, de Valkᵉ vs Warwick | Difference, de Valkᵉ vs Imperial | Difference, Warwick vs Imperial |
|---|---|---|---|---|---|---|---|---|
| STRAf 3A | 158 | 17 | 159.7 (+1.7) | 159.3 (+1.3) | 157.1 (−0.9) | −0.4 | −2.6 | −2.2 |
| STRAf 3A | 164 | 19 | 166.5 (+2.5) | 165.7 (+1.7) | 163.6 (−0.4) | −0.8 | −2.9 | −2.1 |
| STRAf 3A | 173 | 22 | 175.3 (+2.3) | 175.0 (+2.0) | 172.6 (−0.4) | −0.3 | −2.7 | −2.4 |
| STRAf 3A | 185 | 26 | 187.8 (+2.8) | 187.2 (+2.2) | 184.7 (−0.3) | −0.6 | −3.1 | −2.5 |
| STRAf 3A | 224 | 39 | 228.4 (+4.4) | 226.3 (+2.3) | 223.4 (−0.6) | −2.1 | −5.0 | −2.9 |
| STRAf 3A | 233 | 42 | 236.7ᵈ (+3.7) | 235.1 (+2.1) | 232.7 (−0.3) | −1.6 | −4.0 | −2.4 |
| STRAf 3A | 242 | 45 | 246.8ᵈ (+4.8) | 244.7 (+2.7) | 241.7 (−0.3) | −2.1 | −5.1 | −3.0 |
| STRAf 3A | 245 | 46ᶜ | 249.8 (+4.8) | 247.4 (+2.4) | 244.7 (−0.3) | −2.4 | −5.1 | −2.7 |
| STRAf 3B | 162 | 9 | 160.5 (−1.5) | 160.1 (−1.9) | 159.2 (−2.8) | −0.4 | −1.3 | −0.9 |
| STRAf 3B | 165 | 10 | 163.5 (−1.5) | 163.1 (−1.9) | 162.1 (−2.9) | −0.4 | −1.4 | −1.0 |
| STRAf 3B | 168 | 11 | 166.5 (−1.5) | 165.9 (−2.1) | 165.2 (−2.8) | −0.6 | −1.3 | −0.7 |
| STRAf 3B | 195 | 20 | 192.6ᵈ (−2.4) | 191.6 (−3.4) | 191.3 (−3.7) | −1.0 | −1.3 | −0.3 |
| STRAf 3B | 201 | 22 | 198.0 (−3.0) | 197.1 (−3.9) | 196.9 (−4.1) | −0.9 | −1.1 | −0.2 |
| STRAf 3B | 240 | 35 | 235.0ᵈ (−5.0) | 233.7 (−6.3) | 234.2 (−5.8) | −1.3 | −0.8 | +0.5 |
| STRAf 3C | 96 | 11 | 95.2ᵈ (−0.8) | 94.3 (−1.7) | 92.4 (−3.6) | −0.9 | −2.8 | −1.9 |
| STRAf 3C | 114 | 17 | 114.0 (0.0) | 112.5 (−1.5) | 111.1 (−2.9) | −1.5 | −2.9 | −1.4 |
| STRAf 3C | 117 | 18 | 117.2 (+0.2) | 115.6 (−1.4) | 114.0 (−3.0) | −1.6 | −3.2 | −1.6 |
| STRAf 3C | 120 | 19 | 120.5 (+0.5) | 118.4 (−1.6) | 117.1 (−2.9) | −2.1 | −3.4 | −1.3 |
| STRAf 3C | 132 | 23ᶜ | 132.5ᵈ (+0.5) | 130.7 (−1.3) | 129.5 (−2.5) | −1.8 | −3.0 | −1.2 |
| STRAf 3C | 135 | 24 | 135.7 (+0.7) | 134.0 (−1.0) | 132.6 (−2.4) | −1.7 | −3.1 | −1.4 |
| STRAf 3C | 141 | 26 | 141.9ᵈ (+0.9) | 140.3 (−0.7) | 138.9 (−2.1) | −1.6 | −3.0 | −1.4 |
| STRAf 3C | 153 | 30 | 154.4ᵈ (+1.4) | 154.0 (+1.0) | 152.1 (−0.9) | −0.4 | −2.3 | −1.9 |
| STRAf 3C | 165 | 34 | 166.9 (+1.9) | 166.3 (+1.3) | 165.0 (0.0) | −0.6 | −1.9 | −1.3 |
| STRAf 3C | 207 | 48 | 209.9ᵈ (+2.9) | 208.4 (+1.4) | 207.1 (+0.1) | −1.5 | −2.8 | −1.3 |
| STRAf 4A | 178 | 8 | 180.6 (+2.6) | 180.2 (+2.2) | 178.3 (+0.3) | −0.4 | −2.3 | −1.9 |
| STRAf 4A | 186 | 10 | 188.9 (+2.9) | 188.3 (+2.3) | 186.2 (+0.2) | −0.6 | −2.7 | −2.1 |
| STRAf 4A | 190 | 11ᶜ | 192.9ᵈ (+2.9) | 192.2 (+2.2) | 190.0 (−0.0) | −0.7 | −2.9 | −2.2 |
| STRAf 4A | 194 | 12 | 197.4 (+3.4) | 196.4 (+2.4) | 194.1 (+0.1) | −1.0 | −3.3 | −2.3 |
| STRAf 4A | 202 | 14 | 205.4ᵈ (+3.4) | 204.3 (+2.3) | 201.9 (−0.1) | −1.1 | −3.5 | −2.4 |
| STRAf 4A | 206 | 15 | 209.6 (+3.6) | 208.2 (+2.2) | 206.0 (0.0) | −1.4 | −3.6 | −2.2 |
| STRAf 4A | 209ᵇ | 16 | 212.7ᵈ (+3.7) | 211.4 (+2.4) | 209.0 (0.0) | −1.3 | −3.7 | −2.4 |
| STRAf 4A | 213ᵇ | 17 | 216.7 (+3.7) | 215.2 (+2.2) | 213.1 (+0.1) | −1.5 | −3.6 | −2.1 |
| STRAf 4A | 222 | 19 | 226.3 (+4.3) | 224.7 (+2.7) | 222.0 (0.0) | −1.6 | −4.3 | −2.7 |
| STRAf 4B | 166 | 5 | 166.7 (+0.7) | 166.0 (0.0) | 164.3 (−1.7) | −0.7 | −2.4 | −1.7 |
| STRAf 4B | 178 | 8 | 179.4 (+1.4) | 178.2 (+0.2) | 176.3 (−1.7) | −1.2 | −3.1 | −1.9 |
| STRAf 4B | 182 | 9 | 183.0 (+1.0) | 182.3 (+0.3) | 180.5 (−1.5) | −0.7 | −2.5 | −1.8 |
| STRAf 4B | 186 | 10ᶜ | 187.6 (+1.6) | 186.2 (+0.2) | 184.4 (−1.6) | −1.4 | −3.2 | −1.8 |
| STRAf 4B | 190 | 11 | 191.6 (+1.6) | 190.4 (+0.4) | 188.3 (−1.7) | −1.2 | −3.3 | −2.1 |
| STRAf 4C | 163 | 5 | 164.0 (+1.0) | 163.8 (+0.8) | 163.0 (0.0) | −0.2 | −1.0 | −0.8 |
| STRAf 4C | 171 | 7 | 172.5 (+1.5) | 172.2 (+1.2) | 170.6 (−0.4) | −0.3 | −1.9 | −1.6 |
| STRAf 4C | 175 | 8ᶜ | 176.4 (+1.4) | 176.2 (+1.2) | 174.7 (−0.3) | −0.2 | −1.7 | −1.5 |
| STRAf 4C | 267 | 31 | 270.9ᵈ (+3.9) | 269.3 (+2.3) | 267.2 (+0.2) | −1.6 | −3.7 | −2.1 |
ᵃ Includes additional 3′ adenylate resulting from terminal transferase activity.

ᵇ Contains the same 1-base deletion in the 5′ unique region.

ᶜ Af293 allele.

ᵈ Allele not reported by de Valk et al. (11). Sizes were extrapolated from the nearest-sized allele.

ᵉ de Valk, de Valk et al. (11).

In addition to the 3-base insertion found in Af293, we identified a 1-base deletion in the 5′ unique region of two alleles from marker STRAf 4A such that alleles with 16 and 17 repeat units were in fact 1 base smaller than expected. We confirmed that the nearest-sized alleles (with 15 and 19 repeat units) did not contain this deletion. The de Valk et al. study (11) identified alleles with 216.7 and 221.1 bases, which the authors assumed contained repeats where the repeat units differed by 3 bases rather than 4. However, this was not confirmed by sequencing, so we have presumed that these represent alleles with the 1-base deletion in the 5′ unique region and that they contain 17 and 18 repeat units, respectively. The fact that alleles with fewer and more repeat units do not have this deletion suggests that alleles with repeat units ranging in number from 16 to 18 are present in the A. fumigatus population both with and without this deletion. We have, subsequent to this study, in fact identified an isolate with a STRAf 4A allele size of 214 bases, which is consistent with a repeat containing 17 units and no 5′ unique region deletion.
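The arithmetic behind the "1 base smaller than expected" observation can be sketched in Python. The helper `expected_size` is hypothetical, and the choice of the 8-repeat, 178-base allele as the reference point is an assumption made for illustration; the sizes are the sequenced STRAf 4A values from Table 2, and the repeat unit is taken as 4 bases.

```python
def expected_size(repeats, ref_repeats, ref_size, unit):
    """Size predicted if alleles differ only in the number of repeat units."""
    return ref_size + (repeats - ref_repeats) * unit


# STRAf 4A alleles from Table 2: number of repeats -> sequenced size (bases).
sequenced_4a = {8: 178, 10: 186, 11: 190, 12: 194, 14: 202,
                15: 206, 16: 209, 17: 213, 19: 222}

# Assumption: the 8-repeat, 178-base allele serves as the reference.
for repeats, observed in sequenced_4a.items():
    predicted = expected_size(repeats, ref_repeats=8, ref_size=178, unit=4)
    note = "  <- shorter than predicted" if observed < predicted else ""
    print(f"{repeats:2d} repeats: predicted {predicted}, observed {observed}{note}")

# Alleles whose sequenced size does not sit on the 4-base ladder.
deviating = [r for r, s in sequenced_4a.items()
             if s != expected_size(r, 8, 178, 4)]
print("off-ladder alleles (repeats):", deviating)
```

Under these assumptions, only the 16- and 17-repeat alleles fall off the ladder, each by one base, and a 17-repeat allele without the deletion would be predicted at 214 bases, which matches the isolate identified subsequently.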

Not only did the sizes of alleles differ depending on whether they had been determined by sequencing or by using a CE machine, but we also discovered that different machines can give differing sizes. We decided to illustrate this point by sending our samples to two laboratories offering genotyping services and by comparing these data with those produced by de Valk et al. (11) (Table 2). It should be noted that reproducibility within a laboratory can be very high, with sizes being reported to a high degree of precision. Although our sample sizes were small, when the same allele was repeatedly typed (2 to 17 times) by the same CE machine, standard deviations were low (with a mean of 0.1 bases and a range of 0.0 to 0.5 bases for one machine and a mean of 0.2 bases and a range of 0.0 to 1.0 bases for the other machine).

Table 2 shows that fragment sizes obtained by capillary electrophoresis, in addition to being different from the sequenced sizes, differ between the three machines, with differences reaching 5.1 bases. In general, the size difference between two machines is consistently smaller or larger for a specific marker, though this is not the case for the STRAf 3B marker and the Warwick and Imperial machines. The average differences between the sizes obtained from the de Valk et al. study and the Warwick machine, the de Valk et al. study and the Imperial machine, and the Warwick and Imperial machines are 1.1, 2.8, and 1.8 bases, respectively. These differences are also illustrated in Fig. 1, where lines of best fit are shown for the correlations between allele sizes obtained by sequencing and those obtained from the three CE machines. It might be thought that the sole factor affecting the electrophoretic mobility of a DNA molecule during CE is its sequence composition. However, if this were true, then the slopes of the lines for each marker on all three machines would be identical or very similar. In fact, the best-fit lines for marker STRAf 3C differed very little in slope (difference = 0.01), but the difference in the slopes for marker STRAf 4A was considerably larger (difference = 0.05). As the slopes are not the same, factors other than sequence composition must be important. These factors would include the actual machine used as well as the effects of specific running conditions.

FIG. 1.

Graphs of called sizes (by CE) versus actual sizes (by sequencing) for all six markers. Lines of best fit by linear regression are shown. Correlations (r2 values) were greater than 0.999 in all cases.

For those markers and machines for which the slope of the line is 1.00, it should be possible to apply a consistent correction factor in order to arrive at the actual sizes (as determined by sequencing). Thus, for instance, with STRAf 4B and the Imperial machine, a consistent correction factor of +1.7 bases could be applied. Indeed, for some markers and machines, it is possible that no correction factor might be required (e.g., STRAf 4B and the Warwick machine and STRAf 4C and the Imperial machine). Finally, the application of a consistent correction factor would not be possible for some markers and machines, since the size difference can increase with larger fragments (e.g., STRAf 3B and all three machines), can decrease with larger fragments (e.g., STRAf 3C and the Imperial machine), or indeed can go from a positive correction value to a negative one as fragment size increases (e.g., STRAf 3C and the Warwick machine). In these instances, the line-of-best-fit equation can be used to correct CE values.
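A minimal Python sketch of the line-of-best-fit correction follows, using the STRAf 3B sequenced sizes and Warwick CE sizes from Table 2. The function names are hypothetical, and the repeat ladder (9 repeats at 162 bases, 3 bases per repeat) is an assumption read off Table 2. Because the same alleles are used both to fit the line and to check it, the result is illustrative only; it says nothing about how well the correction performs on unsequenced alleles.

```python
def fit_line(actual, called):
    """Least-squares slope and intercept of called size on actual size."""
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(called) / n
    sxx = sum((x - mean_x) ** 2 for x in actual)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, called))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def repeats_from_size(size, ref_size=162, ref_repeats=9, unit=3):
    """Snap a size to the nearest allele on the assumed STRAf 3B ladder."""
    return ref_repeats + round((size - ref_size) / unit)


# STRAf 3B from Table 2: (repeats, sequenced size, Warwick CE size).
calibration = [(9, 162, 160.1), (10, 165, 163.1), (11, 168, 165.9),
               (20, 195, 191.6), (22, 201, 197.1), (35, 240, 233.7)]

slope, intercept = fit_line([s for _, s, _ in calibration],
                            [c for _, _, c in calibration])

results = []
for true_repeats, _, called in calibration:
    corrected = (called - intercept) / slope  # invert the best-fit line
    results.append((true_repeats,
                    repeats_from_size(called),      # uncorrected call
                    repeats_from_size(corrected)))  # corrected call

for true_repeats, uncorrected, fixed in results:
    print(f"true={true_repeats} uncorrected={uncorrected} corrected={fixed}")
```

In this sketch the uncorrected CE sizes snap to the wrong allele, whereas the sizes corrected with the fitted line recover the sequenced repeat numbers. A single fixed offset would not achieve this for STRAf 3B, since the size difference grows with fragment length.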

DISCUSSION

Any new typing system will have potential problems that can range from technical issues with the technology to issues associated with the specific markers developed in the study. Nearly all studies will discuss whether or not their markers are sufficiently discriminatory to address specific biological questions but will often state that the technology used is reproducible and readily transferable between laboratories. Lack of reproducibility has been a problem with older typing techniques, such as restriction fragment length polymorphism and random amplification of polymorphic DNA analyses, which has resulted in these techniques being less commonly used nowadays. The newer PCR-based techniques of MLST and MLMT are generally assumed to be highly reproducible once PCR conditions have been optimized. In addition, as with any experiment, other sources of error should always be considered, especially human ones, such as the mislabeling of samples (12, 32). A few genotyping studies where microsatellite length polymorphisms were employed have attempted to quantify the sizes and natures of many different types of error (6, 12, 16, 40).

Although other technologies are employed in MLMT studies, such as separating and sizing fragments on high-percent-agarose gels, commercially available sequence analyzers are most frequently used. However, as one of the manufacturers of these machines points out, “A common misconception about DNA fragment-sizing is that the calculated size of a DNA fragment is equivalent to the length of the fragment. Because the electrophoretic mobility of DNA is sequence-dependent, DNA fragments of the same length can have different mobilities and, therefore, can vary in calculated size” (2). The authors of most MLMT systems developed for microorganisms, including those of the two systems developed for A. fumigatus (4, 11), fail to mention this discordance, which occurs with both SE and CE machines. In contrast, Schouls et al. (36) state that the addition of an extra repeat to the calculated number was required to obtain the true number of repeats by sequencing. Dalle et al. (8) reported a 7-base discrepancy for one of their alleles, Maho et al. (28) reported differences of up to 3 bases, and Foulet et al. (15) reported an example of a 2-base difference. The most dramatic examples of this discordance have been provided by Lista et al. (27), where 25 loci were used to type Bacillus anthracis strains. They found differences of up to 8, 11, and 17 bases for three of their markers based on repeat units of 9 nucleotides. Finally, Farlow et al. (13) stated that they sequenced at least one allele for each locus because of the lack of agreement between sequenced size and size obtained by an SE machine, and Keim et al. (23) stated that they determined the actual sequences of most alleles because of differences of 1 or 2 nucleotides. Keim et al. (23) and Lista et al. (27) actually provided reasons for these discordances and mentioned DNA conformational differences (though these should be minimal under denaturing conditions), migrational deviations of the size standard, the nature of the gel matrix used, and sequence composition. It should be noted that in three of these studies, this problem was mentioned only within Materials and Methods.

We were able to find only two MLMT studies published in the microbiology literature that discussed the fact that different machines can generate different sizes for the same allele. Even in these cases, the differences were inferred to be caused by switches from SE machines to CE machines (8) or were hinted at rather than stated directly (27). In contrast, we in this study, and others outside the field of microbiology, have shown that different CE machines and running conditions will affect the sizing of alleles (10, 25, 33, 38, 40, 41). Another microbiology study made the very general statement that separation techniques have been shown to produce different results for the same locus and therefore that special care must be taken when standardizing typing data (34).

Because these technical limitations are overlooked, MLMT using SE or CE machines is “sold” as a technology that is highly reproducible and readily portable between laboratories. It has been presented as such even in those studies that have pointed out these problems, where phrases such as “represents a robust and easily transferable approach” (23), “are easily compared to data generated at dispersed laboratories” (13), and “yields unambiguous numeric profiles that can easily be electronically exchanged” (36) have been used. Only one of these studies emphasized caution, stating that laboratories using different machines and reagents need to correct their sizes before alleles can be called and datasets merged (27). Where techniques and technologies have been compared for typing specific organisms, in this example, Aspergillus fumigatus, the only technical problems mentioned in relation to the use of SE or CE machines were those associated with the PCR itself (terminal transferase activity of Taq DNA polymerase and stutter bands), which are usually readily spotted on electrophoretic traces (5, 26). We feel that these omissions regarding the discordance between sizes obtained by sequencing and sizes obtained by machine, and machine-to-machine variation, are remiss and have the potential to mislead microbiologists (as we were) into thinking mistakenly that it is straightforward to set up an open-access MLMT database that could be used by many laboratories.

There are other technical problems of which potential users should be made aware (33). Switching to another fluorescent label can alter the mobility of a DNA molecule by ±0.5 bases (10), and temperature fluctuations in the laboratory housing the machine can alter mobility by up to 0.7 bases, with a temperature difference of 5°C (9). Other, potentially more serious problems include migrational deviations of the internal size standard, and we have identified one study which used an internal size standard that has been recommended not to be employed with CE machines (1, 38). The calibration method used is also important, and the local Southern approach is recommended (16). A possible major problem with diploid organisms is “allelic dropout,” where amplification of a smaller allele is favored over that of a larger allele, with the consequence that an individual is scored as being homozygous when, in fact, it is heterozygous (6, 12, 20). Again, we could find no mention of this problem within any studies of diploid microorganisms, though one study stated that heterozygosity may be underestimated because of the existence of null alleles, where mutations in the primer binding site cause the PCR to fail (31, 35).

What strategies can be employed, then, to calibrate alleles for a specific marker and to correct for interlaboratory differences? Even if the results are very precise when the determinations are repeated in the same laboratory, it is important to distinguish “called” sizes (obtained from SE or CE machines) from actual sizes, since exact sizes can be determined only by sequencing. The actual sizes of alleles would be those recorded in any international database setup. By far, the best approach is to develop marker-specific size ladders which contain all the common alleles for a given locus. This is the approach used in forensic medicine, and it is not an unrealistic approach for molecular typing, as each system usually involves only a few loci (17, 18, 25). Ideally, a ladder would be developed for every locus in a given system. Some studies have stated that the inclusion of controls for which the allele sizes are known is required (6, 8, 9, 16, 23, 27, 33); however, this relies on the assumption that the size differences are consistent across the entire size range of those loci (21, 33, 40). The inclusion of at least one control is, of course, sensible to ensure intralaboratory consistency in the reporting of allele sizes (4-7, 15, 16, 22, 39, 40). The frequency distribution of alleles can also be used to ensure that data are consistent between laboratories (33). If the option of marker-specific size ladders is not available, then each participating laboratory will have to carry out an in-house calibration by sequencing and determining the actual sizes of a range of alleles, in the manner that we have employed, and to use these to enable a correction factor to be applied for the specific machine and running conditions used in that laboratory. This determination of correction factors is not, however, a straightforward task, as determining the length of long repeats by sequencing can be difficult because slippage of the Taq DNA polymerase results in double (and triple) peaks in the electrophoretic trace toward the end of the repeat. Peak size can also drop off precipitously toward the end of the repeat, as dideoxy terminators are rapidly depleted.
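Whether a single control allele is enough, or whether a range of sequenced alleles is needed, depends on whether the size difference is constant across the size range. A rough check might look like the following Python sketch. The function names and the 0.5-base tolerance are illustrative assumptions, not values from this study; the data are the STRAf 4B (Imperial) and STRAf 3B (Warwick) sizes from Table 2.

```python
def offsets(sequenced, called):
    """Per-allele difference (called minus sequenced), rounded to 0.1 base."""
    return [round(c - s, 1) for s, c in zip(sequenced, called)]


def single_offset_is_adequate(sequenced, called, tolerance=0.5):
    """True if offsets vary by no more than `tolerance` bases across alleles.

    `tolerance` is an illustrative assumption, not a value from the study.
    """
    diffs = offsets(sequenced, called)
    return (max(diffs) - min(diffs)) <= tolerance


# Values copied from Table 2.
seq_4b = [166, 178, 182, 186, 190]
imperial_4b = [164.3, 176.3, 180.5, 184.4, 188.3]
seq_3b = [162, 165, 168, 195, 201, 240]
warwick_3b = [160.1, 163.1, 165.9, 191.6, 197.1, 233.7]

print("STRAf 4B, Imperial:", offsets(seq_4b, imperial_4b),
      single_offset_is_adequate(seq_4b, imperial_4b))
print("STRAf 3B, Warwick:", offsets(seq_3b, warwick_3b),
      single_offset_is_adequate(seq_3b, warwick_3b))
```

With these data, the STRAf 4B offsets on the Imperial machine stay within a narrow band, so one control allele would plausibly suffice, whereas the STRAf 3B offsets on the Warwick machine drift by several bases and would call for calibration against a range of sequenced alleles.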

In conclusion, accurate and reproducible sizing of DNA molecules is essential for typing with microsatellites. The problem of sizing errors seems to have received little attention in the microbiology literature, and we believe that authors have a duty to highlight potential pitfalls when describing any new typing system.

Acknowledgments

A.C.P. was sponsored by CAPES (Brazilian government), and M.J.A. was supported by a Wellcome Trust (United Kingdom) program grant. This project was funded by the Fungal Research Trust (United Kingdom) and the Hospital Infection Society (United Kingdom).

We thank Helen Bird and Lesley Ward at The University of Warwick and Rachel Emerson and Wanda Stow at Imperial College for processing the CE samples and for their advice and assistance. We thank Mark Bond at The University of Manchester for processing the sequencing samples and for his advice.

Footnotes

Published ahead of print on 13 December 2006.

REFERENCES

  • 1.Ajzenberg, D., A.-L. Bañuls, M. Tibayrenc, and M. L. Dardé. 2002. Microsatellite analysis of Toxoplasma gondii shows considerable polymorphism structured into two main clonal groups. Int. J. Parasitol. 32:27-38. [DOI] [PubMed] [Google Scholar]
  • 2.Applied Biosystems. 2004, posting date. Microsatellite analysis on the Applied Biosystems 3130 Series genetic analyzers. http://docs.appliedbiosystems.com/pebiodocs/00113901.pdf.
  • 3.Balajee, S. A., J. L. Gribskov, E. Hanley, D. Nickle, and K. A. Marr. 2005. Aspergillus lentulus sp. nov., a new sibling species of A. fumigatus. Eukaryot. Cell 4:625-632. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Bart-Delabesse, E., J.-F. Humbert, E. Delabesse, and S. Bretagne. 1998. Microsatellite markers for typing Aspergillus fumigatus isolates. J. Clin. Microbiol. 36:2413-2418. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Bart-Delabesse, E., J. Sarfati, J.-P. Debeaupuis, W. van Leeuwen, A. van Belkum, S. Bretagne, and J.-P. Latgé. 2001. Comparison of restriction fragment length polymorphism, microsatellite length polymorphism, and random amplification of polymorphic DNA analyses for fingerprinting Aspergillus fumigatus isolates. J. Clin. Microbiol. 39:2683-2686. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Bonin, A., E. Bellemain, P. Bronken Eidesen, F. Pompanon, C. Brochmann, and P. Taberlet. 2004. How to track and assess genotyping errors in population genetics studies. Mol. Ecol. 13:3261-3273. [DOI] [PubMed] [Google Scholar]
  • 7.Botterel, F., C. Desterke, C. Costa, and S. Bretagne. 2001. Analysis of microsatellite markers of Candida albicans used for rapid typing. J. Clin. Microbiol. 39:4076-4081. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Dalle, F., N. Franco, J. Lopez, O. Vagner, D. Caillot, P. Chavanet, B. Cuisenier, S. Aho, S. Lizard, and A. Bonnin. 2000. Comparative genotyping of Candida albicans bloodstream and nonbloodstream isolates at a polymorphic microsatellite locus. J. Clin. Microbiol. 38:4554-4559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Davison, A., and S. Chiba. 2003. Laboratory temperature variation is a previously unrecognized source of genotyping error during capillary electrophoresis. Mol. Ecol. Notes 3:321-323. [Google Scholar]
  • 10.Delmotte, F., N. Leterme, and J.-C. Simon. 2001. Microsatellite allele sizing: difference between automated capillary electrophoresis and manual technique. BioTechniques 31:810-818. [PubMed] [Google Scholar]
  • 11.de Valk, H. A., J. F. G. M. Meis, I. M. Curfs, K. Muehlethaler, J. W. Mouton, and C. H. W. Klaassen. 2005. Use of a novel panel of nine short tandem repeats for exact and high-resolution fingerprinting of Aspergillus fumigatus isolates. J. Clin. Microbiol. 43:4112-4120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Ewen, K. R., M. Bahlo, S. A. Treloar, D. F. Levinson, B. Mowry, J. W. Barlow, and S. J. Foote. 2000. Identification and analysis of error types in high-throughput genotyping. Am. J. Hum. Genet. 67:727-736. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Farlow, J., K. L. Smith, J. Wong, M. Abrams, M. Lytle, and P. Keim. 2001. Francisella tularensis strain typing using multiple-locus, variable-number tandem repeat analysis. J. Clin. Microbiol. 39:3186-3192. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Fisher, M. C., S. De Hoog, and N. Vanittanakom. 2004. A highly discriminatory multilocus microsatellite typing (MLMT) system for Penicillium marneffei. Mol. Ecol. Notes 4:515-518. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Foulet, F., N. Nicolas, O. Eloy, F. Botterel, J.-C. Gantier, J.-M. Costa, and S. Bretagne. 2005. Microsatellite marker analysis as a typing system for Candida glabrata. J. Clin. Microbiol. 43:4574-4579. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Ghosh, S., Z. E. Karanjawala, E. R. Hauser, D. Ally, J. I. Knapp, J. B. Rayman, A. Musick, J. Tannenbaum, C. Te, S. Shapiro, W. Eldridge, T. Musick, C. Martin, J. R. Smith, J. D. Carpten, M. J. Brownstein, J. I. Powell, R. Whiten, P. Chines, S. J. Nylund, V. L. Magnuson, M. Boehnke, F. S. Collins, and the FUSION (Finland-U.S. Investigation of NIDDM Genetics) Study Group. 1997. Methods for precise sizing, automated binning of alleles, and reduction of error rates in large-scale genotyping using fluorescently labeled dinucleotide markers. Genome Res. 7:165-178. [DOI] [PubMed] [Google Scholar]
  • 17.Gill, P., C. Kimpton, E. D'Aloja, J. F. Andersen, W. Bar, B. Brinkmann, S. Holgersson, V. Johnsson, A. D. Kloosterman, M. V. Lareu, L. Nellemann, H. Pfitzinger, C. P. Phillips, H. Schmitter, P. M. Schneider, and M. Stenersen. 1994. Report of the European DNA profiling group (EDNAP)—towards standardisation of short tandem repeat (STR) loci. Forensic Sci. Int. 65:51-59. [DOI] [PubMed] [Google Scholar]
  • 18.Griffiths, R. A. L., M. D. Barber, P. E. Johnson, S. M. Gillbard, M. D. Haywood, C. D. Smith, J. Arnold, T. Burke, A. J. Urquhart, and P. Gill. 1998. New reference allelic ladders to improve allelic designation in a multiplex STR system. Int. J. Legal Med. 111:267-272. [DOI] [PubMed] [Google Scholar]
  • 19.Groathouse, N. A., B. Rivoire, H. Kim, H. Lee, S.-N. Cho, P. J. Brennan, and V. D. Vissa. 2004. Multiple polymorphic loci for molecular typing of strains of Mycobacterium leprae. J. Clin. Microbiol. 42:1666-1672. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Hoffman, J. I., and W. Amos. 2005. Microsatellite genotyping errors: detection approaches, common sources and consequences for paternal exclusion. Mol. Ecol. 14:599-612. [DOI] [PubMed] [Google Scholar]
  • 21.Idury, R. M., and L. R. Cardon. 1997. A simple method for automated allele binning in microsatellite markers. Genome Res. 7:1104-1109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Johansson, A., J. Farlow, P. Larsson, M. Dukerich, E. Chambers, M. Byström, J. Fox, M. Chu, M. Forsman, A. Sjöstedt, and P. Keim. 2004. Worldwide genetic relationships among Francisella tularensis isolates determined by multiple-locus variable-number tandem repeat analysis. J. Bacteriol. 186:5808-5818. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Keim, P., L. B. Price, A. M. Klevytska, K. L. Smith, J. M. Schupp, R. Okinaka, P. J. Jackson, and M. E. Hugh-Jones. 2000. Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis. J. Bacteriol. 182:2928-2936. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Klevytska, A. M., L. B. Price, J. M. Schupp, P. L. Worsham, J. Wong, and P. Keim. 2001. Identification and characterization of variable-number tandem repeats in the Yersinia pestis genome. J. Clin. Microbiol. 39:3179-3185. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.LaHood, E. S., P. Moran, J. Olsen, W. S. Grant, and L. K. Park. 2002. Microsatellite allele ladders in two species of Pacific salmon: preparation and field-test results. Mol. Ecol. Notes 2:187-190. [Google Scholar]
  • 26.Lasker, B. A. 2002. Evaluation of performance of four genotypic methods for studying the genetic epidemiology of Aspergillus fumigatus isolates. J. Clin. Microbiol. 40:2886-2892. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Lista, F., G. Faggioni, S. Valjevac, A. Ciammaruconi, J. Vaissaire, C. le Doujet, O. Gorgé, R. De Santis, A. Carattoli, A. Ciervo, A. Fasanella, F. Orsini, R. D'Amelio, C. Pourcel, A. Cassone, and G. Vergnaud. 2006. Genotyping of Bacillus anthracis strains based on automated capillary 25-loci multiple locus variable-number tandem repeats analysis. BMC Microbiol. 6:33. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Maho, A., A. Rossano, H. Hächler, A. Holzer, E. Schelling, J. Zinsstag, M. H. Hassane, B. S. Toguebaye, A. J. Akakpo, M. Van Ert, P. Keim, L. Kenefic, J. Frey, and V. Perreten. 2006. Antibiotic susceptibility and molecular diversity of Bacillus anthracis strains in Chad: detection of a new phylogenetic subgroup. J. Clin. Microbiol. 44:3422-3425. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Maiden, M. C. J., J. A. Bygraves, E. Feil, G. Morelli, J. E. Russell, R. Urwin, Q. Zhang, J. Zhou, K. Zurth, D. A. Caugant, I. M. Feavers, M. Achtman, and B. G. Spratt. 1998. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. USA 95:3140-3145. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Nierman, W. C., A. Pain, M. J. Anderson, J. R. Wortman, H. S. Kim, J. Arroyo, M. Berriman, K. Abe, D. B. Archer, C. Bermejo, J. Bennett, P. Bowyer, D. Chen, M. Collins, R. Coulsen, R. Davies, P. S. Dyer, M. Farman, N. Fedorova, N. Fedorova, T. V. Feldblyum, R. Fischer, N. Fosker, A. Fraser, J. L. Garcia, M. J. García, A. Goble, G. H. Goldman, K. Gomi, S. Griffith-Jones, R. Gwilliam, B. Haas, H. Haas, D. Harris, H. Horiuchi, J. Huang, S. Humphray, J. Jiménez, N. Keller, H. Khouri, K. Kitamoto, T. Kobayashi, S. Konzack, R. Kulkarni, T. Kumagai, A. Lafon, J.-P. Latgé, W. Li, A. Lord, C. Lu, W. H. Majoros, G. S. May, B. L. Miller, Y. Mohamoud, M. Molina, M. Monod, I. Mouyna, S. Mulligan, L. Murphy, S. O'Neil, I. Paulsen, M. A. Peñalva, M. Pertea, C. Price, B. L. Pritchard, M. A. Quail, E. Rabbinowitsch, N. Rawlins, M. A. Rajandream, U. Reichard, H. Renauld, G. D. Robson, S. Rodriguez de Córdoba, J. M. Rodríguez-Peña, C. M. Ronning, S. Rutter, S. L. Salzberg, M. Sanchez, J. C. Sánchez-Ferrero, D. Saunders, K. Seeger, R. Squares, S. Squares, M. Takeuchi, F. Tekaia, G. Turner, C. R. Vazquez de Aldana, J. Weidman, O. White, J. Woodward, J. H. Yu, C. Fraser, J. E. Galagan, K. Asai, M. Machida, N. Hall, B. Barrell, and D. W. Denning. 2005. Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature 438:1151-1156. [DOI] [PubMed] [Google Scholar]
  • 31.Ochsenreither, S., K. Kuhls, M. Schaar, W. Presber, and G. Schönian. 2006. Multilocus microsatellite typing as a new tool for discrimination of Leishmania infantum MON-1 strains. J. Clin. Microbiol. 44:495-503. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Odds, F. C., A. D. Davidson, M. D. Jacobsen, A. Tavanti, J. A. Whyte, C. C. Kibbler, D. H. Ellis, M. C. J. Maiden, D. J. Shaw, and N. A. R. Gow. 2006. Candida albicans strain maintenance, replacement, and microvariation demonstrated by multilocus sequence typing. J. Clin. Microbiol. 44:3647-3658. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32a.Pasqualotto, A. C., D. W. Denning, and M. J. Anderson. 2005. Multilocus microsatellite typing (MLMT) of Aspergillus fumigatus: problems with sizing repeats and development of an alternative system using longer repeats. Mycoses 48(Suppl. 2):viii. [Google Scholar]
  • 33.Presson, A., E. Sobel, K. Lange, and J. Papp. 2006. Merging microsatellite data. J. Comput. Biol. 13:1131-1147. [DOI] [PubMed] [Google Scholar]
  • 34.Sampaio, P., L. Gusmão, C. Alves, C. Pina-Vaz, A. Amorim, and C. Pais. 2003. Highly polymorphic microsatellite for identification of Candida albicans strains. J. Clin. Microbiol. 41:552-557. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Sampaio, P., L. Gusmão, A. Correia, C. Alves, A. G. Rodrigues, C. Pina-Vaz, A. Amorim, and C. Pais. 2005. New microsatellite multiplex PCR for Candida albicans strain typing reveals microevolutionary changes. J. Clin. Microbiol. 43:3869-3876. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Schouls, L. M., A. van der Ende, M. Damen, and I. van de Pol. 2006. Multiple-locus variable-number tandem repeat analysis of Neisseria meningitidis yields groupings similar to those obtained by multilocus sequence typing. J. Clin. Microbiol. 44:1509-1518. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Smith, K. L., V. De Vos, H. B. Bryden, M. E. Hugh-Jones, A. Klevytska, L. B. Price, P. Keim, and D. T. Scholl. 1999. Meso-scale ecology of anthrax in southern Africa: a pilot study of diversity and clustering. J. Appl. Microbiol. 87:204-207. [DOI] [PubMed] [Google Scholar]
  • 38.Vainer, M., S. Enad, V. Dolnik, D. Xu, J. Bashkin, M. Marsh, O. Tu, D. W. Harris, D. L. Barker, and E. S. Mansfield. 1997. Short tandem repeat typing by capillary array electrophoresis: comparison of sizing accuracy and precision using different buffer systems. Genomics 41:1-9. [DOI] [PubMed] [Google Scholar]
  • 39.Walker, A., S. J. Petheram, L. Ballard, J. R. Murph, G. J. Demmler, and J. F. Bale, Jr. 2001. Characterization of human cytomegalovirus strains by analysis of short tandem repeat polymorphisms. J. Clin. Microbiol. 39:2219-2226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Weeks, D. E., Y. P. Conley, R. E. Ferrell, T. S. Mah, and M. B. Gorin. 2002. A tale of two genotypes: consistency between two high-throughput genotyping centers. Genome Res. 12:430-435. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Wenz, H.-M., J. M. Robertson, S. Menchen, F. Oaks, D. M. Demorest, D. Scheibler, B. B. Rosenblum, C. Wike, D. A. Gilbert, and J. W. Efcavitch. 1998. High-precision genotyping by denaturing capillary electrophoresis. Genome Res. 8:69-80. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Journal of Clinical Microbiology are provided here courtesy of American Society for Microbiology (ASM)
