Applied and Environmental Microbiology. 2005 Apr;71(4):2026–2035. doi: 10.1128/AEM.71.4.2026-2035.2005

Improved Assessment of Denitrifying, N2-Fixing, and Total-Community Bacteria by Terminal Restriction Fragment Length Polymorphism Analysis Using Multiple Restriction Enzymes

Christopher Rösch 1, Hermann Bothe 1,*
PMCID: PMC1082561  PMID: 15812035

Abstract

A database of terminal restriction fragments (tRFs) of the 16S rRNA gene was set up utilizing 13 restriction enzymes and 17,327 GenBank sequences. A computer program, termed TReFID, was developed to allow identification of any of these 17,327 sequences by means of polygons generated from the specific tRFs of each bacterium. The TReFID program complements and exceeds in its data content the Web-based phylogenetic assignment tool recently described by A. D. Kent, D. J. Smith, B. J. Benson, and E. W. Triplett (Appl. Environ. Microbiol. 69:6768-6776, 2003). The method to identify bacteria is different, as is the region of the 16S rRNA gene employed in the present program. For the present communication the software for tRF profiles has also been extended to allow screening for genes coding for N2 fixation (nifH) and denitrification (nosZ) in any bacterium or environmental sample. A number of controls were performed to test the reliability of the TReFID program. Furthermore, the TReFID program has been shown to permit the analysis of the bacterial population structure by means of the 16S rRNA, nifH, and nosZ gene content in an environmental habitat, as exemplified for a sample from a forest soil. The use of the TReFID program reveals that noncultured denitrifying and dinitrogen-fixing bacteria might play a more dominant role in soils than believed hitherto.


The characterization of bacterial communities in environments is complicated by extreme species richness and a high variability of distribution within short distances. Soils are estimated to harbor up to 10^10 bacteria of about 10^4 different ribotypes per g (3, 19), of which more than 95% cannot be cultured by present methods (2, 11). However, knowledge about microbial community structure and its variations is required to predict nutrient fluxes and to analyze the effects of xenobiotics. More recently, molecular methods have been introduced to get access to the noncultured population of bacteria in environmental samples. Most approaches published so far either rely on DNA sequencing of clone libraries (16, 21) or employ fingerprinting techniques (for example, denaturing gradient gel electrophoresis [DGGE] [12] or terminal restriction fragment length polymorphism [tRFLP] analysis [9]) to demonstrate microbial diversity or population shifts in microbial communities. Clone libraries supply novel sequence information and allow the phylogenetic identification of individual clones, but this approach is laborious and expensive. PCR-based fingerprinting techniques such as DGGE or tRFLP (9) primarily provide population-specific signatures. DGGE bands can be sequenced but usually yield relatively short sequences. Although tRFLP bands cannot be sequenced, this method is preferred to DGGE, because tRFLP allows analysis of population structures of more complex communities. However, since individual peaks of a tRFLP profile determined using a single restriction enzyme can be derived from a whole range of nonrelated species in many cases, tRFLP is used to characterize shifts in a population structure but not to identify individual strains in a community.
To circumvent this limitation, a method has recently been introduced to produce a species list for a bacterial population of unknown composition on the basis of experimental tRFLP data from separate digests of multiple different restriction enzymes (6). This phylogenetic tool is based on analysis of fluorescence-labeled terminal restriction fragments of 16S rRNA gene segments. Independently of that work (6), this laboratory generated software by a similar approach that also employed the 16S rRNA gene; however, this approach uses a different sequence area which is present in most sequences deposited in the databanks. For this computer program, termed TReFID, data from up to 13 different restriction enzymes were used and a unique algorithm was developed. This computer program allows a statistical analysis of experimental data and comparison with reference sequences from public databases.

In addition to this phylogenetic assignment method for the analysis of tRFLP profiles of the 16S rRNA gene, the present work describes, for the first time, the development of a similar tool for N2 fixation and denitrification. For N2 fixation, nifH coding for dinitrogenase reductase was chosen because this gene is highly conserved among microorganisms (22) and because some 2,000 sequences are deposited in GenBank. For denitrification, nosZ, encoding nitrous oxide reductase, with only about 180 entries in GenBank, was selected. A further disadvantage of this gene is that it does not occur in denitrifying bacteria which do not reduce N2O. However, this gene had to be chosen because all of the other steps of denitrification (reduction of nitrate to nitrite, nitrite to nitric oxide, and nitric oxide to nitrous oxide) are catalyzed by at least two different enzymes in each case. Moreover, the sequence information for all these genes is even scarcer than that available for nitrous oxide reductase.

To demonstrate the potential of the present approach, the bacterial population of a forest soil from the vicinity of Cologne (Germany) was analyzed for its content with respect to the 16S rRNA, nifH, and nosZ genes. For this goal, a soil was selected with a high carbon/nitrogen ratio, which led us to assume that both denitrifying and N2-fixing bacteria occurred there simultaneously and with a high abundance. DNA was extracted from samples of the soil, and segments of the three genes were amplified by PCR and subjected to tRFLP analysis. Additionally, PCR products generated from the DNA of the forest soil were cloned and sequenced. The species list obtained by the TReFID software was compared with the sequence data from the clone libraries. Data were also evaluated by various control procedures which will be shown in detail. It is suggested that tRFLP analysis employing multiple restriction enzymes will become a useful tool to analyze complex microbial communities and one which will improve as more sequence information for the different enzymes becomes available.

MATERIALS AND METHODS

Soil samples used and techniques for the molecular characterization of the DNA.

The soil analyzed for its bacterial content came from Dünnwald forest (51°01′00"N, 07°03′47"E) in the suburban area of Cologne, Germany. This loamy sand soil had a pH of 3.9, a water content of about 10% (wt/wt), and a humus content of 4 to 6% (wt/wt) on the sampling date of 2 October 2002. The high C/N ratio of 18.3 to 19.1 was particularly noteworthy. The total N content was 0.28 to 0.30% (wt/wt). Other data obtained with a solution extracted from the soil were as follows: for NH4+, 1.95 μg/ml; for NO3−, 17.35 μg/ml; for P, 0.9 ppm; for S, 10.9 ppm; for Fe, 2.2 ppm; for Al, 3.1 ppm; for K, 16.5 ppm; for Mg, 2.3 ppm; for Mn, 0.32 ppm; and for Na, 7.4 ppm (n = 2).

Triplicate soil samples were taken from two 4-m^2 patches in October 2002. Soil cores from the upper 20 cm, excluding nondecomposed litter, were homogenized. Total DNA was then extracted using an UltraClean Soil DNA kit (MoBio, Solana Beach, Calif.). The DNA preparation obtained was used as a template for amplifying the 16S rRNA, nifH, and nosZ genes by PCR. For the 16S rRNA gene, the primers 63F (10) and 778R (AGG GTA TCT AAT CCT GTT TGC) were routinely used for tRFLP analysis. This new primer, 778R, was tested with both laboratory cultures and clone libraries and provided amplicons from a wide range of organisms (C. Rösch, diploma thesis). To construct clone libraries, the additional primer combinations 27F-1495R (20) and 63F-1387R (10) were employed. Segments of nifH for tRFLP were amplified with the following primers (wobble positions are given as IUPAC ambiguity codes): nifHF (AAA GGY GGW ATC GGY AAR TCC ACC AC) and nifHRb (TGS GCY TTG TCY TCR CGG ATB GGC AT). For the clone libraries, the alternative reverse primer nifHR (ATG ATG GCS ATG TAY GCS GCS AAC AA) or nifHRc (TGG GCY TTG TTY TCR CGG ATY GGC AT) was used. The nosZ segments were obtained using nosZFb (AAC GCC TAY ACS ACS CTG TTC) and nosZRb (TCC ATG TGC AGN GCR TGG CAG AA). The primers were chosen as described in reference 16, with some minor modifications. PCR products of nifH and nosZ were about 400 and 700 bp in length, respectively. Hot start PCRs in a 25-μl volume were performed using a MasterTaq kit (Eppendorf, Hamburg, Germany) followed by touch-down time programs of 40 cycles in a Personal Cycler (Biometra, Göttingen, Germany). Annealing temperatures decreased stepwise from 66 to 56°C for the 16S rRNA gene amplifications and from 65 to 50°C in the case of nifH and nosZ.

To construct a clone library, PCR products were purified with a MinElute gel extraction kit (QIAGEN, Hilden, Germany), cloned using a pGEMT Easy Vector system (Promega, Mannheim, Germany), and sequenced with a BigDye Terminator cycle sequencing kit version 1.1 (Applied Biosystems, Weiterstadt, Germany) and an ABI 3100 automatic sequencer (Applied Biosystems). Raw sequences were processed in BioEdit 5.0.9 (5) and verified by BlastN (1), ClustalX alignments (18), and ChimeraCheck (8). The phylogenetic affiliation of the novel clones was deduced by means of the RDP II Phylip Interface (http://rdp.cme.msu.edu/cgis/phylip.cgi). Additionally, neighbor-joining phylogenies of 100 replicate trees were constructed with ClustalX and visualized with TreeView 1.6.1 (14). Gap positions were excluded from the analysis, but corrections for multiple substitutions were applied. Corrected sequences were deposited in GenBank (www.ncbi.nlm.nih.gov [accession no. AY723961 to AY724250]).

For tRFLP analyses, 5′ fluorochrome-labeled PCR primers were used: 63F-6-carboxyfluorescein or 63F-6-carboxy-4,5-dichloro-2′,7′-dimethoxyfluorescein (JOE) for the 16S rRNA gene, nifHF-6-carboxyfluorescein for nifH, and nosZR-6-carboxytetramethylrhodamine for nosZ (MWG Biotech, Ebersberg, Germany). Purified PCR products from a reaction with nonlabeled primers were reamplified in a second PCR with labeled primers up to a total volume of 1,000 μl of PCR product. Products of this second reaction were purified (QiaQuick gel extraction kit [QIAGEN] or Ultrafree-MC columns [Millipore, Bedford, Mass.]), which also showed that PCR products had been obtained in sufficient quantities. The preparation was then partitioned into up to 13 aliquots and digested overnight at 37°C in a 100-μl volume by separately using one of the following restriction endonucleases (MBI Fermentas, Leon-Rot, Germany) per tube: AluI (AG/CT), Bme1390I (CC/NGG), Bsh1236I (CG/CG), Cfr13I (G/GNCC), HaeIII (GG/CC), Hin6I (G/CGC), HinfI (G/ANTC), MboI (/GATC), MspI (C/CGG), or RsaI (GT/AC). Digestions with TaiI (ACGT/), TacI (T/CGA), and TasI (/AATT) were performed at 65°C. After ethanol precipitation, the fragment mixtures were dissolved in 10 μl of sterile deionized water. Prior to gel loading, the samples (2.5 μl) were mixed with formamide (1.0 μl), loading buffer (Applied Biosystems) (1.0 μl), and a GeneScan 500 ROX size standard (Applied Biosystems) (0.5 μl). The analysis of 1.6 μl of this mixture was performed by 3 h of electrophoresis on a 36-cm-long 4.5% polyacrylamide gel at 2,700 V (ABI 377 automatic sequencer equipped with GeneScan 3.1.2; Applied Biosystems). The sizes of fragments were determined by using the local Southern method implemented in GeneScan 3.1.2 and a GS-500 ROX size standard. Only fragment lengths in the range of 30 to 500 nucleotides (nt) were considered for analysis.
The noise threshold for signal processing was set as low as possible (generally about 20 relative fluorescence units) to cover a broad range of prominent and weak signals at the same time. Thus, peak heights for single tRFLP profiles were distributed over 2 orders of magnitude (∼20 to ∼2,000 relative fluorescence units). The results were exported as tab-delimited text files (GeneScan 3.1.2) and further processed with TReFID (Fig. 1).
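The size window and noise threshold described above can be illustrated with a minimal sketch. The record layout (size in nt, peak height in relative fluorescence units) and the function name `filter_peaks` are assumptions made for illustration; they do not reflect the actual GeneScan export format or the TReFID source code, and whether the threshold is inclusive is likewise assumed.

```python
# Hypothetical peak records: (fragment size in nt, peak height in relative
# fluorescence units). The real GeneScan export has a different layout.
MIN_SIZE_NT = 30    # lower size limit stated in the text
MAX_SIZE_NT = 500   # upper size limit stated in the text
NOISE_RFU = 20      # approximate noise threshold stated in the text


def filter_peaks(peaks):
    """Keep peaks inside the size window and at or above the noise threshold."""
    return [
        (size, height)
        for size, height in peaks
        if MIN_SIZE_NT <= size <= MAX_SIZE_NT and height >= NOISE_RFU
    ]


# Invented example values, not measured data.
example = [(25.4, 300), (135.2, 22), (325.0, 15), (380.7, 1800), (512.3, 90)]
print(filter_peaks(example))
```

In this invented example, the peaks at 25.4 and 512.3 nt fall outside the size window and the peak at 325.0 nt lies below the noise threshold, so only two peaks remain.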

FIG. 1.

A flow chart describing the steps in the identification procedure of the TReFID program.

Software development.

The source code of the data analysis software, TReFID, was written in PureBasic 3.80 and compiled to FASM assembler language. TReFID is a small stand-alone executable file for Microsoft Windows XP, which is currently the only operating system supported. It can be obtained as a ZIP file from http://www.trefid.net. The reference database is selected at the program start. These data are sorted and displayed in three different ways. In a first tree-view control, all species contained in the selected database are displayed in alphabetical order. A second tree-view control summarizes all entries according to their phylogenies (based on Bergey's Manual 2001). The third tree-view control arranges the entries by their expected terminal restriction fragments (tRFs) for each restriction enzyme in use. As examples, databases are provided for the 16S rRNA (63F), nifH (nifHF), and nosZ (nosZR) genes.

Preparation of the databases.

DNA sequence data for the three genes examined (the 16S rRNA, nifH, and nosZ genes) were downloaded in GenBank format by use of the Entrez nucleotide database query form (www.ncbi.nlm.nih.gov). In the cases of nifH and nosZ, all available sequences (October 2003) were analyzed using multiple sequence alignments (ClustalX 1.81) (18). Sequences including the binding sites for the fluorochrome-labeled primers (nifHF and nosZR) were incorporated in the respective TReFID databases. Those sequences had to be modified to fit the 5′ end of the primer for nifHF or the 3′ end of that for nosZR. Overhang nucleotides were cut, while missing nucleotides (up to 30) in the primer binding site were completed with “N” to assure correct fragment sizes. This modification was automatically accomplished by use of our own GBSD program (available under http://www.trefid.net) and a table of the number of nucleotides (up to 30) to be removed from or added to the reference sequences. Alignments of all sequences were manually done with ClustalX. Only 16S rRNA gene sequences deposited before January 2003 were used and similarly analyzed. A single GenBank sequence can be characterized by its set of theoretically derived tRFs. A graphical representation of this would be a polygon in a spider web graph (Fig. 2), where each axis represents a different restriction enzyme; the respective fragment sizes are dots on these axes, the origin being in the center.
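How a set of theoretical tRFs (a polygon) might be derived from one reference sequence can be sketched as follows. This is an illustrative assumption, not the TReFID or GBSD implementation: the function names are hypothetical, only six of the enzymes are included, the sequence is assumed to start at the 5′ end of a forward-labeled primer (as for 63F or nifHF), and the 30- to 500-nt window and the special handling of sites inside the primer are omitted. For a reverse-labeled primer such as nosZR, the fragment would have to be measured from the other end.

```python
import re

# Recognition sites as listed in Materials and Methods; "/" marks the cut
# position and N stands for any base.
ENZYMES = {
    "AluI": "AG/CT",
    "HaeIII": "GG/CC",
    "HinfI": "G/ANTC",
    "MboI": "/GATC",
    "MspI": "C/CGG",
    "RsaI": "GT/AC",
}


def terminal_fragment_length(sequence, site):
    """Length of the 5'-terminal fragment, or None if the enzyme does not cut."""
    cut_offset = site.index("/")
    pattern = site.replace("/", "").replace("N", "[ACGT]")
    match = re.search(pattern, sequence.upper())
    if match is None:
        return None
    return match.start() + cut_offset


def polygon(sequence, enzymes=ENZYMES):
    """Map each enzyme to its predicted tRF length (None when no site exists)."""
    return {name: terminal_fragment_length(sequence, site)
            for name, site in enzymes.items()}


# Toy sequence, not a real gene: a HaeIII site at position 40 and an AluI
# site at position 104.
toy = "A" * 40 + "GGCC" + "A" * 60 + "AGCT" + "A" * 20
print(polygon(toy))
```

Each value in the returned dictionary corresponds to one dot on one axis of the spider web graph; enzymes without a site yield `None` and would be left out of the polygon.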

FIG. 2.

Graphical representation of the tRFs obtained for nifH of N2-fixing bacteria and nosZ of denitrifying microorganisms. (I) For nifH, tRFs derived from the sequences deposited in GenBank (a) and tRFs experimentally obtained from the DNA isolated from the Dünnwald soil (b) are shown. (II) For nosZ, tRFs derived from the sequences deposited in GenBank (a) and tRFs experimentally obtained from the DNA isolated from the Dünnwald soil (b) are shown. The shaded polygon symbolizes the tRFs from Azospirillum brasilense Sp7. The lengths of the tRFs are given on the axes in the range between 0 and 500 bp.

Data analysis.

To produce a species list with the TReFID software, both the desired database and the experimental tRFLP data have to be loaded. Two analysis parameters can be set by the user to allow fine tuning. (i) The lengths of tRFs determined mathematically can differ from those of the experimentally determined tRFs obtained from the DNA of an environmental sample (see Results). Therefore, the latter are not treated as discrete values by TReFID but as intervals with score values of 1.0, 0.5, or 0.25. (ii) The tRFs obtained from the DNA of an environmental sample have to match the tRF polygon of a sequence deposited in the TReFID data bank. The minimum proportion of tRFs which have to be retrieved from a polygon is two-thirds (for example, 8 enzymes with matches out of a total of 12 employed). This is defined as the threshold value (see Results). Enzymes with expected fragments more than 500 nt in length were excluded from the analysis of a sequence. Predicted tRFs less than 30 nt in length were allowed when a restriction site was located inside the primer binding site. In this case, the second restriction site, located outside the primer sequence, was used for the TReFID database and also for the analysis of the sample DNA. A score sum was assigned to each polygon in TReFID for which ≥2/3 of the tRFs were matched by tRFs of the DNA to be analyzed. The procedure is described in detail under Results; the following definitions apply.

Polygon.

A graph is formed by all tRFs of one GenBank entry (thus for one organism or one sequence deposited) by use of all restriction enzymes (Fig. 2). Only polygons whose tRFs meet the threshold of ≥2/3 are considered further when a sample is compared with the polygons deposited in the databank.

Match.

A tRF is retrieved from the TReFID databank with a deviation between expected and experimentally obtained lengths of not more than 1.5%.

Score.

Deviations between determined and theoretical tRF lengths as found in the TReFID bank are scored as follows: results of ≤0.5% receive a score of 1.0; results between 0.5 and 1.0% receive a score of 0.5; results between 1.0 and 1.5% receive a score of 0.25; and results of >1.5% receive a score of 0.
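The scoring scheme can be written as a small function, shown here as a sketch and not as the TReFID source code. It uses the bands that are also given under Results (up to 0.5%, up to 1.0%, up to 1.5%). Two details are assumptions: the deviation is taken relative to the theoretical length, and each band includes its upper limit.

```python
def trf_score(expected_nt, observed_nt):
    """Score one observed tRF against the theoretical tRF of a database entry."""
    # Assumption: the deviation is expressed relative to the theoretical length.
    deviation_pct = abs(observed_nt - expected_nt) / expected_nt * 100.0
    if deviation_pct <= 0.5:
        return 1.0
    if deviation_pct <= 1.0:
        return 0.5
    if deviation_pct <= 1.5:
        return 0.25
    return 0.0


# Invented lengths for a theoretical tRF of 200 nt.
for observed in (200.8, 201.6, 202.6, 204.0):
    print(observed, trf_score(200, observed))
```

For a theoretical tRF of 200 nt, the four invented observations deviate by 0.4, 0.8, 1.3, and 2.0% and therefore fall into the four different bands.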

Score sum.

The sum of the scores of the tRFs obtained with all the restriction enzymes employed, referred to a single database entry (a polygon for an organism).

Threshold.

For the threshold value, ≥2/3 of restriction enzymes have to provide a tRF with a score of >0 for an analyzed sequence.
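Taken together, the definitions of match, score sum, and threshold suggest an evaluation of one polygon along the following lines. This is a minimal sketch under stated assumptions: the function name `evaluate_polygon` and the dictionary layout are hypothetical, and enzymes without a usable tRF are represented by `None` and left out of the denominator.

```python
def evaluate_polygon(scores):
    """Summarize per-enzyme scores for one database entry.

    scores maps an enzyme name to 1.0, 0.5, 0.25, or 0.0, or to None when
    the enzyme provides no usable tRF for this entry.
    """
    usable = [s for s in scores.values() if s is not None]
    if not usable:
        return None
    matches = sum(1 for s in usable if s > 0)
    score_sum = sum(usable)
    return {
        "enzymes": len(usable),
        "matches": matches,
        "score_sum": score_sum,
        "similarity_pct": 100.0 * score_sum / len(usable),
        # Integer comparison avoids rounding problems with the 2/3 threshold.
        "passes": 3 * matches >= 2 * len(usable),
    }


print(evaluate_polygon({"AluI": 1.0, "HaeIII": 0.5, "MspI": 0.0, "RsaI": None}))
```

In the invented example, two of three usable enzymes give a match, so the entry just meets the threshold, with a score sum of 1.5 out of 3.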

TReFID.

TReFID represents the terminal restriction fragment identifying program.

RESULTS

Content of the tRFs for nifH, the 16S rRNA gene, and nosZ in the TReFID databank.

The terminal restriction fragment identifying program, termed TReFID (see Materials and Methods) (Fig. 1) was developed to retrieve the tRF pattern of a specific organism for each of the three nifH, 16S rRNA, and nosZ genes within the total tRF data deposited in the TReFID bank. For this, in the case of nifH 1,873 GenBank sequences available in October 2003 have been examined with respect to their nifHF primer binding site content. Of these, 1,031 did not contain this site and 20 did not show sufficient homology for primer binding. Primers constructed from other nifH regions would not have provided more sequences (unpublished data). Thus, the resulting 822 sequences had to be sufficient for analysis by computer for restriction sites downstream from the nifHF binding site. This procedure provided for each restriction enzyme a different number of distinguishable, unique tRFs, ranging from a minimum of 30 (with HinfI) to a maximum of 78 (with MspI or Bsh1236I). This is visualized by the black dots in the axes of the spider web graph of Fig. 2I, in which data are presented for the tRFs derived from the GenBank sequences (panel a) and for those experimentally obtained from the DNA isolated from the Dünnwald soil sample (panel b). In the case of nifH, only eight restriction enzymes out of the 13 readily available with a 4-bp recognition site were employed. In total, for the eight restriction enzymes, 6,020 tRFs were obtained, of which only 505 were unique (Table 1). This small number is due to the fact that a cut within a conserved motif can give identical tRFs for different organisms. All 6,020 tRFs were taken for further analysis. Thus, for nifH, 822 polygons, each representing an entry for an organism, were deposited in the TReFID databank. A total of 683 polygons were derived from the GenBank sequences, and 139 polygons came from our clone library. 
To demonstrate the feature of a polygon for a specific bacterium, the tRFs for Azospirillum brasilense Sp7 are visualized, as represented by the shaded area in Fig. 2I, panel a. All the different polygons can be used to screen for related entities (organisms) with the tRFs provided by an environmental sample.

TABLE 1.

Summary of the total sequences and tRFs obtained for the three 16S rRNA, nifH, and nosZ genes

| No. of restriction enzymes providing a tRF | 16S rRNA gene: no. of sequences | 16S rRNA gene: total no. of tRFs (a) | nifH: no. of sequences | nifH: total no. of tRFs (b) | nosZ: no. of sequences | nosZ: total no. of tRFs (c) |
|---|---|---|---|---|---|---|
| 13 | 1,087 | 14,131 | | | | |
| 12 | 3,995 | 47,940 | | | | |
| 11 | 4,715 | 51,865 | | | | |
| 10 | 3,557 | 35,570 | | | | |
| 9 | 2,501 | 22,509 | | | 42 | 378 |
| 8 | 1,002 | 8,016 | 138 | 1,104 | 63 | 504 |
| 7 | 378 | 2,646 | 375 | 2,625 | 18 | 126 |
| 6 | 141 | 846 | 235 | 1,410 | 5 | 30 |
| 5 | 44 | 220 | 108 | 540 | 0 | 0 |
| 4 | 35 | 140 | 57 | 228 | 0 | 0 |
| 3 | 5 | 15 | 37 | 111 | 0 | 0 |
| 2 | 2 | 4 | 1 | 2 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 17,462 | 183,902 | 951 | 6,020 | 128 | 1,038 |

(a) Number of unique tRFs, 4,536. Number of entries per unique tRF, 40.5.

(b) Number of unique tRFs, 505. Number of entries per unique tRF, 11.9.

(c) Number of unique tRFs, 129. Number of entries per unique tRF, 8.0.
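The footnote values for the number of entries per unique tRF follow from the totals in Table 1 as the total number of tRFs divided by the number of unique tRFs. The short check below only restates that arithmetic; the variable and function names are chosen for illustration.

```python
# Totals and unique-tRF counts as given in Table 1 and its footnotes.
TABLE_1 = {
    "16S rRNA": {"total_trfs": 183902, "unique_trfs": 4536},
    "nifH": {"total_trfs": 6020, "unique_trfs": 505},
    "nosZ": {"total_trfs": 1038, "unique_trfs": 129},
}


def entries_per_unique_trf(total_trfs, unique_trfs):
    """Average number of database tRFs that share one distinct fragment length."""
    return total_trfs / unique_trfs


ratios = {gene: round(entries_per_unique_trf(**counts), 1)
          for gene, counts in TABLE_1.items()}
print(ratios)
```

The higher this ratio, the more database entries share a given fragment length for a single enzyme, which is why several enzymes are needed to tell entries apart.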

In the case of the 16S rRNA gene, all 13 restriction enzymes were employed. The sequence information for this gene in GenBank, plus 135 sequences from our Dünnwald clone library, was more extensive than the data for nifH. Altogether, 31,927 sequences out of the 73,245 available in January 2003 were analyzed. Among these, 17,462 containing the 63F primer binding site were suitable for the generation of tRFs from the 63F primer (Table 1). This provided 183,902 tRFs altogether, of which only 4,536 were unique. The 17,462 polygons developed for 16S rRNA gene represent a solid basis of data for further analysis.

Unfortunately, the sequence information for nosZ is limiting. Only 85 out of 181 GenBank sequences available in October 2003 were suitable for constructing tRFs by use of the nosZR primer which binds near the 3′ terminus of the gene. Together with the 43 sequences of the clone library, these GenBank sequences provided 1,038 tRFs, with only 129 being unique (Table 1). The present meager sequence information for nosZ is reflected by the few dots in the corresponding spider web graph (Fig. 2II, panel a) and the low number (128) of polygons obtained. In contrast, the sequence information for the 16S rRNA gene is much more than can be shown in a spider web graph similar to those for nifH and nosZ, because an axis for any restriction enzyme would contain too many dots (tRFs) to be graphically resolved.

Demonstration of the applicability of the approach by using DNA from an environmental sample.

The applicability of the method was tested with forest soil from the vicinity of Cologne. The loamy sand soil of the Dünnwald forest was selected because this soil was not rich in nitrogen, in contrast to many other locations in the Cologne area which are N saturated (16). Therefore, we expected a relatively high abundance of both denitrifying and N2-fixing bacteria which would enable us to retrieve tRFs of both groups. DNA isolated from this soil was used for PCR amplifications with the fluorochrome-labeled nifHF primer. Restriction digests with the eight different enzymes yielded 284 tRFs in total (Fig. 2I, panel b). These tRFs were then examined for related entities (organisms) with the tRF pattern in the nifH databank by use of the TReFID program (Table 1). When a tRF for a given restriction enzyme from the sample matched one in the databank within 0.5% of the total nucleotide length, the score was set to 1.0. When the determined length of a Dünnwald tRF differed from that of the closest one in the TReFID databank by more than 0.5% and up to 1%, the score was set to 0.5. When the difference was more than 1% and up to 1.5%, the score was defined as 0.25. Any larger difference was scored as 0. Those restriction enzymes which did not provide a tRF for a segment because of the absence of a restriction site within the 30- to 500-bp sequence were not considered any further.

The polygon for Vibrio diazotrophicus was selected to illustrate the method. The enzymes AluI and MboI had no restriction site within the 400 bp between the nifHF and nifHR motifs and therefore could not be employed for analysis. Among the tRFs of the remaining six restriction enzymes used, five gave the maximal score, meaning that the size of the best matching tRF showed less than 0.5% deviation from the predicted one. However, one enzyme (HinfI) gave a fragment of nearly the same size (being maximally 1.5% larger or smaller than the fragment predicted for V. diazotrophicus) and thus a score of 0.25. The score sum for V. diazotrophicus in the Dünnwald soil is thus 5.25 out of 6 (88% similarity). Thus, the program indicates with high fidelity that an organism occurs in the Dünnwald soil which is closely related to V. diazotrophicus.

As another example, in the case of the noncultured clone DUN1+B26, all eight restriction enzymes provided a tRF, but the similarity value was 6.75 out of 8 due to results that included six scores of 1.0, one score of 0.5, and one score of 0.25. The overall similarity between the pool of tRFs from the Dünnwald soil and DUN1+B26 was 6.75/8 (84%), indicating that tRFs of a bacterium closely related to the deposited clone from Dünnwald, DUN1+B26, had been retrieved. The method does not permit one to decide whether this bacterium retrieved by the tRF pattern was identical to that from the clone library. Clearly, the lower the score sum (in percent similarity), the lower is the probability that an organism with sequence similarity to any bacterium of the TReFID bank is present in the environmental sample. A match in only two cases, for example, could have been due to restriction sites at the same sequence position in two totally unrelated bacteria and would therefore be meaningless.
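The two worked examples (V. diazotrophicus and clone DUN1+B26) can be reproduced from the scores stated in the text. The sketch below only restates that arithmetic; the function name `similarity` is chosen for illustration.

```python
def similarity(scores):
    """Return the score sum and the similarity in percent for one polygon."""
    score_sum = sum(scores)
    return score_sum, 100.0 * score_sum / len(scores)


# V. diazotrophicus: six usable enzymes, five scores of 1.0 and one of 0.25.
vibrio = [1.0] * 5 + [0.25]
# Clone DUN1+B26: eight enzymes, six scores of 1.0, one of 0.5, one of 0.25.
dun1_b26 = [1.0] * 6 + [0.5, 0.25]

print(similarity(vibrio))    # score sum 5.25, i.e. 87.5%, reported as 88%
print(similarity(dun1_b26))  # score sum 6.75, i.e. 84.375%, reported as 84%
```

Because enzymes without a restriction site are dropped from the denominator, the two percentages refer to different numbers of enzymes (six and eight).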

A threshold value was arbitrarily set for all three genes. In the case of the 16S rRNA gene, the tRFs, obtained from at least 9 restriction enzymes out of the 13 maximally used and providing a tRF within the range of 30 to 500 bp, had to give a score of 1.0, 0.5, or 0.25 (and thus matches of at least two of three) to be considered for further analysis (Table 2). Thus, the tRFs of the DNA of an environmental sample also form a polygon. To be used for the further analysis, at least 9 tRFs of the DNA of this sample had to match the 13 tRFs of one polygon present in TReFID. Cases of matches giving results of <2/3 were discarded. If one (or more) restriction enzyme(s) had no site within the PCR fragment analyzed, the tRF for this enzyme was left out. Matches were then referred to 12 (or less) enzymes, but the required threshold value was also kept at ≥2/3.
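The minimum number of matching enzymes implied by the ≥2/3 threshold can be computed by rounding up, as in the sketch below. The function name is hypothetical and the rounding rule is an assumption: it reproduces the values given in the text for 13, 12, and 9 usable enzymes, whereas five of eight is reported for nifH, so the rule actually applied by the program may differ for some enzyme counts.

```python
def min_matches(usable_enzymes):
    """Smallest number of matching enzymes that is at least 2/3 of those usable."""
    # Integer form of ceil(2 * n / 3), which avoids floating-point rounding.
    return (2 * usable_enzymes + 2) // 3


for n in (13, 12, 9):
    print(n, min_matches(n))
```

With 13 usable enzymes, nine matches are required; with 12, eight; and with nine (as used for nosZ), six.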

TABLE 2.

Similarities of the tRFs polygons obtained from DNA of the Dünnwald soil with those of the TReFID databank

No. of tRF polygons retrieved in the TReFID results list, by similarity (a) (% calculated score):

| Similarity (a) (% calculated score) | 16S rRNA | nifH | nosZ |
|---|---|---|---|
| 94-100 | 122 | 0 | 1 |
| 87-93.9 | 215 | 7 | 4 |
| 80-86.9 | 640 | 3 | 7 |
| 73-79.9 | 999 | 47 | 5 |
| 66-72.9 | 414 | 105 | 7 |
| 59-65.9 | 0 | 29 | 0 |
| 52-58.9 | 0 | 0 | 0 |
| Total | 2,390 | 191 | 24 |

(a) A high percent similarity value indicates that a bacterium occurs in the Dünnwald soil which is closely related to one with its sequence in the TReFID databank. A value of 100% means a perfect match (within <0.5%) for all restriction enzymes employed. The relatedness (or probability of occurrence) decreases with similarity percentage.

All the matches giving results of ≥2/3 were further analyzed for their score sum by the TReFID program. An overall similarity value (i.e., the score sum/number of restriction enzymes with a restriction site in the sequence range analyzed) between 94 and 100% was regarded as being highly indicative of similarity to a polygon consisting of tRFs for a specific bacterium of the TReFID databank. In the case of the 16S rRNA gene, the majority of sequences of the TReFID results (1,037) listed for the Dünnwald soil had similarity values between 73 and 79% (Table 2).

In the case of nifH, five out of eight restriction enzymes had to give a match with a score of 1.0, 0.5, or 0.25 (i.e., a match of two out of three at least) for the sequences to be considered any further. Remarkably, the average similarity value was about 10% less in the case of nifH than that for the 16S rRNA gene (Table 2).

The same protocol was tried for nosZ (Fig. 2II). The threshold was set to six out of nine restriction enzymes utilized (i.e., a match of at least two out of three). The highest matches of a polygon of Dünnwald soil DNA were to Pseudomonas stutzeri, with a score sum of 7.0 out of 8 restriction enzymes that had a restriction site inside the fragment analyzed, Paracoccus pantotrophus, with a score sum of 6.0 out of 7, and Azospirillum lipoferum, with a score sum of 6.0 out of 8. Thus, organisms closely related to the denitrifying bacteria mentioned occurred in the Dünnwald soil. The observed similarity value of the Dünnwald sequences for nosZ was as low as that for nifH (Table 2), although the size of the databank for nosZ was too small to allow us to draw a definitive conclusion here.

Controls to assess the quality of the TReFID program and to ascertain the feasibility of the approach. Control A.

The 16S rRNA gene tRFs obtained from the Dünnwald soil DNA were examined for matches with any of the 135 polygons of the Dünnwald clone library as part of the TReFID databank (Fig. 3a). For this examination, the stringency in the two parameters used to compare the tRFs from the soil DNA and Dünnwald clone library was varied. The ordinate in the figure indicates the percentage of hits of polygons (i.e., the sum of tRFs for each bacterium deposited) obtained from the DNA isolated from the Dünnwald soil (i.e., entity 1) within the total entries of the Dünnwald clone library (i.e., entity 2). The percentage of matches (polygon similarities) between the two entities ranging between one of two and three of four is given in one abscissa, whereas the percentage of deviations between the lengths of the tRFs from entity 1 and 2 between 0.33 and 1.0% is shown in the other. Figure 3a reveals that a high percentage of tRFs retrieved from the soil DNA are represented in the Dünnwald clone library. However, the correlation between both entities cannot be perfect, since the 135 clones deposited into the library cannot always be retrieved with different DNA preparations from a soil. In addition, DNA isolation, PCR amplification, and sequencing may have caused false negatives which are particularly obvious in cases of extreme stringency where three-fourths of the tRFs have to be present in both entities with a maximal deviation of 0.33% (Fig. 3). Taking all these difficulties into account, the proportion of sequences matching between both entities is remarkably high. Clearly, direct sequencing of the clones obtained provided more accurate data but is much more time consuming and expensive than TReFID analysis.
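The variation of the two stringency parameters in control A amounts to a grid evaluation, which might be sketched as follows. This is a simplified illustration and not the procedure used to produce Fig. 3: the function names and data layout are hypothetical, and a tRF is counted as matched whenever its deviation is within the tolerance, without the graded scores.

```python
def fraction_retrieved(library, observed, min_fraction, max_dev_pct):
    """Fraction of library polygons retrieved from the observed tRFs.

    library:  list of polygons (enzyme -> expected length in nt, or None)
    observed: enzyme -> list of experimentally determined lengths in nt
    """
    if not library:
        return 0.0
    hits = 0
    for poly in library:
        usable = {e: length for e, length in poly.items() if length is not None}
        if not usable:
            continue
        matched = sum(
            1 for e, length in usable.items()
            if any(abs(o - length) / length * 100.0 <= max_dev_pct
                   for o in observed.get(e, []))
        )
        if matched >= min_fraction * len(usable):
            hits += 1
    return hits / len(library)


def sweep(library, observed, fractions=(0.5, 0.75), deviations=(0.33, 0.5, 1.0)):
    """Evaluate every combination of match fraction and length tolerance."""
    return {(f, d): fraction_retrieved(library, observed, f, d)
            for f in fractions for d in deviations}


# Invented toy data with four hypothetical enzymes A to D.
library = [{"A": 100, "B": 200, "C": 300, "D": 400},
           {"A": 150, "B": 250, "C": None, "D": 450}]
observed = {"A": [100.2, 151.4], "B": [200.9], "C": [300.0], "D": [460.0]}
print(sweep(library, observed))
```

As in Fig. 3, the retrieved fraction in this toy example drops when the required proportion of matches is raised and the permitted deviation is narrowed at the same time.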

FIG. 3.

Identification of polygons representing organisms in the Dünnwald soil in two libraries. DNA was isolated from the Dünnwald soil, and the lengths of the tRFs obtained with all 13 restriction enzymes were experimentally determined for the 16S rRNA gene. The lengths of the tRFs were then compared with those calculated using the sequences from the Dünnwald clone library (a) and from GenBank (b). The ordinate represents the percentages of polygons of the soil DNA among the total polygons of either the Dünnwald clone library (a) or the GenBank sequences (b). One abscissa (front side) denotes the proportions of enzymes providing a tRF with a match corresponding to the total number of enzymes employed and providing tRFs for analysis. The other abscissa (right side) denotes the deviations in the lengths of the tRFs. The percentage of sequences retrieved was low when the deviation was restricted to ≤0.33% and was high at a deviation of ≤1.00%. For the further calculations, the values 0.66 (for the number of matches) and 0.5% (for the deviation) were selected (dark column in both parts of the figure). However, score values of 1.0, 0.5, or 0.25 (for definitions, see text) were taken for the calculations and this figure.

The tRFs from the Dünnwald soil DNA were also screened for their occurrence in sequences from the total TReFID databank, presently containing 17,327 entries (Fig. 3b). A large number of polygons can apparently be retrieved from this bank. However, few polygons are exactly the same when those developed from the environmental sample (Dünnwald soil DNA) are compared with those deposited in the TReFID databank (Fig. 3b, front right-side corner column).

From the Dünnwald soil DNA, 1,373 polygons formed by tRFs scoring above the threshold (i.e., ≥2/3 matches) in the total 17,327 entries of the 16S rRNA gene TReFID databank could be retrieved altogether. The corresponding values for the nifH and nosZ sequences were 130 out of 824 and 23 out of 97 total entries, respectively. To demonstrate the specificity of the TReFID bank, tRFs of the Dünnwald soil DNA were screened in the TReFID bank for the heterologous, false genes. For nifH tRFs, only three matching 16S rRNA gene polygons were detected in the 17,327 sequences of the 16S rRNA gene database, whereas all other combinations (tRFs for nifH in the nosZ bank, for the 16S rRNA gene in either the nifH or nosZ banks, and for nosZ in either the 16S rRNA gene or nifH banks) gave negative results.
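The two stringency parameters varied in Fig. 3 (the fraction of informative enzymes that yield a matching tRF and the tolerated deviation in tRF length) can be illustrated by a minimal sketch. This is not the TReFID implementation: the function names, the data layout, and the binary per-enzyme match are assumptions made for illustration (the graded score values of 1.0, 0.5, and 0.25 mentioned in the legend to Fig. 3 are not reproduced), and the fragment lengths are invented.

```python
from typing import Dict, List, Optional

# Hypothetical data layout: observed peaks per enzyme for the sample,
# one predicted tRF length per enzyme for a databank entry (None = no site).
Sample = Dict[str, List[float]]
Entry = Dict[str, Optional[float]]


def enzyme_matches(observed: List[float], predicted: float, max_dev: float) -> bool:
    """True if any observed peak lies within the relative window around `predicted`."""
    return any(abs(obs - predicted) <= max_dev * predicted for obs in observed)


def match_fraction(sample: Sample, entry: Entry, max_dev: float = 0.005) -> Optional[float]:
    """Fraction of informative enzymes (those providing a tRF) that give a match."""
    informative = 0
    hits = 0
    for enzyme, predicted in entry.items():
        if predicted is None:  # enzyme has no site in the segment: not considered
            continue
        informative += 1
        if enzyme_matches(sample.get(enzyme, []), predicted, max_dev):
            hits += 1
    return hits / informative if informative else None


def is_hit(sample: Sample, entry: Entry,
           min_fraction: float = 2 / 3, max_dev: float = 0.005) -> bool:
    """Apply both thresholds: at least 2/3 matches, at most 0.5% length deviation."""
    fraction = match_fraction(sample, entry, max_dev)
    return fraction is not None and fraction >= min_fraction


# Invented example: three informative enzymes, two of which match.
sample = {"AluI": [72.0, 210.4], "MboI": [135.2], "RsaI": [420.0]}
entry = {"AluI": 210, "MboI": 135, "RsaI": 455, "HaeIII": None}
print(match_fraction(sample, entry))          # two of three informative enzymes
print(is_hit(sample, entry))                  # passes the 2/3 threshold
print(is_hit(sample, entry, min_fraction=0.75))  # fails at three of four
```

Raising `min_fraction` or lowering `max_dev` in this sketch corresponds to moving toward the high-stringency corner of Fig. 3, where fewer polygons are retrieved.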

Control B.

The TReFID data allowed the construction of synthetic tRFLP profiles of the genes in the Dünnwald soil DNA assayed on the basis of the tRF entries in the TReFID result list, as exemplified in Fig. 4, for the 16S rRNA gene by the use of Bsh1236I (a), MboI (b), and RsaI (c). The different heights in the peaks of the profile constructed from the TReFID database reflect the abundances of tRFs at distinct nucleotide lengths. The 16S rRNA gene tRFs of the Dünnwald soil were obtained by PCR using fluorochrome labeling, and their nucleotide lengths were determined experimentally. They corresponded well with those of the reconstructed profile. However, the peak height was small in some cases and was slightly above the peak detection threshold value, set at 20 relative fluorescence units (see peaks at 135, 325, or 380 nt in Fig. 4b). Thus, it was not clear in these cases whether such small peaks indicated a hit for a tRF. Despite this drawback, this procedure for constructing a synthetic tRF profile allows verification of the composition of a bacterial population for any environmental habitat with respect to the 16S rRNA, nifH, and nosZ genes or any other gene with a tRF data bank when this approach is applied to the tRFs obtained for all restriction enzymes.

FIG. 4.

Comparison of the tRF profiles obtained experimentally with DNA from the Dünnwald soil and predicted from the sequence information in the TReFID result list. DNA was isolated from the Dünnwald soil sample, and the tRF lengths were determined experimentally. By using the polygon constructions from the tRFs of all 13 restriction enzymes, a list of bacteria in the Dünnwald soil which are closely related to organisms with entries in the TReFID databank could be compiled. The tRFs of these closest relatives in the databank were then taken to construct the predicted tRF profile of the DNA from the Dünnwald soil.
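The construction of a synthetic profile described for control B can be sketched as follows: for one enzyme, the predicted tRF lengths of all entries in the result list are counted, so that peak height reflects the abundance of tRFs at each length, and the predicted peaks are then checked against experimental peaks at or above the 20-relative-fluorescence-unit detection threshold. The function names, the data layout, the 0.5% length window, and all values below are assumptions for illustration and are not taken from the TReFID software.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional


def synthetic_profile(hits: Iterable[Dict[str, Optional[int]]], enzyme: str) -> Dict[int, int]:
    """Count how many result-list entries predict a tRF of each length for `enzyme`."""
    counts = Counter(
        entry[enzyme] for entry in hits if entry.get(enzyme) is not None
    )
    return dict(sorted(counts.items()))


def supported_peaks(profile: Dict[int, int], observed: Dict[float, float],
                    min_rfu: float = 20.0, max_dev: float = 0.005) -> List[int]:
    """Predicted lengths that have an experimental peak above threshold within the window."""
    supported = []
    for length in profile:
        for obs_length, rfu in observed.items():
            if rfu >= min_rfu and abs(obs_length - length) <= max_dev * length:
                supported.append(length)
                break
    return supported


# Invented result list: predicted tRF lengths (nt) per enzyme for matched entries.
hits = [{"MboI": 135}, {"MboI": 135}, {"MboI": 325}, {"MboI": None}, {"RsaI": 420}]
profile = synthetic_profile(hits, "MboI")

# Invented experimental peaks: length (nt) -> height (relative fluorescence units).
observed = {135.3: 24.0, 326.0: 15.0, 380.1: 22.0}
print(profile)
print(supported_peaks(profile, observed))
```

In this toy case the predicted peak at 325 nt has an experimental counterpart below the detection threshold, which mirrors the ambiguity noted above for small peaks.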

Control C.

The whole 16S rRNA gene TReFID databank was examined for the occurrence of organisms with tRF profiles (polygons) similar to those of three selected organisms picked out of the same databank, namely, Escherichia coli (E05133), Azospirillum brasilense (AY324110), and Rhizobium leguminosarum (AF5333683). As before, the match in the tRFs had to be at least two of three to hit a polygon of one of the three organisms. The TReFID databank provided 537 polygons related to the above-named organisms. Their sequences available in the TReFID databank showed that only 257 of them were at least 500 bp in length and thus suitable for constructing a phylogenetic tree using the neighbor-joining method. A total of 192 sequences remained after the elimination of identical sequences deposited with different accession numbers. These were utilized to construct a phylogenetic tree for the same region used for predicting the tRFs. The phylogenetic tree provided a clear separation of the potential organisms belonging to E. coli, R. leguminosarum, or A. brasilense in 183 out of 192 cases (Fig. 5). However, sequences of three bacteria (Pseudoalteromonas prydzensis, Shewanella gelidimarina, and Idiomarina baltica) (plus three closely related sequences which did not separate in the phylogenetic tree), as well as sequences from three noncultured proteobacteria, did not fit at all. Therefore, these artifacts were analyzed in more detail. In the case of S. gelidimarina (AF530149), the eight enzymes AluI, Bsh1236I, Cfr13I, HaeIII, HinfI, MboI, MspI, and RsaI provided a tRF with the same length (score, 1.0) both for this bacterium and for E. coli. Two restriction enzymes had no site in the segment analyzed and therefore could not be considered. Despite the fact that the score was 0 with the two enzymes Hin6I and TaiI, the matches due to eight positive-testing enzymes were above the threshold level of two out of three. Thus, S. gelidimarina could not be excluded by the method employed. 
However, the phylogenetic tree clearly indicates that S. gelidimarina is at best distantly related to E. coli (85.1% homology between the two organisms). A similar calculation resulted in the selection of the other two giving false positives: P. prydzensis and I. baltica. In sum, however, the method employed is reliable in more than 95% of all cases, but care has to be taken to discern the few false positives. Thus, this approach allows retrieval not only of the composition of bacteria down to the genus level but also of most of the bacteria related to a specific organism (e.g., E. coli) from an environmental sample.
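The arithmetic behind this false positive and behind the overall reliability estimate can be written out explicitly. The sketch below assumes, in line with the legend to Fig. 3, that the match proportion is the number of enzymes giving a matching tRF divided by the number of enzymes providing a tRF at all, so that enzymes without a site in the segment do not enter the denominator.

```latex
\[
\text{match proportion}_{S.\ gelidimarina \text{ vs. } E.\ coli}
  = \frac{8}{8 + 2} = 0.80 \;\geq\; \frac{2}{3},
\]
\[
\text{fraction of correctly separated sequences}
  = \frac{183}{192} \approx 0.953 .
\]
```

With 8 of 10 informative enzymes matching, the polygon passes the two-of-three threshold even though the sequences are only distantly related, and 183 of 192 corresponds to the figure of more than 95% given in the text.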

FIG. 5.

The occurrence of the 16S rRNA gene tRF polygons related to those from Azospirillum brasilense, Escherichia coli, or Rhizobium leguminosarum. The TReFID program provided 192 polygons related to those of the three organisms and deposited in the TReFID databank (8 for A. brasilense, 111 for E. coli, and 64 for R. leguminosarum). Their sequences were used to construct the phylogenetic tree by use of the neighbor-joining method. The tree shows that only six artifacts were obtained (see text). The average sequence homologies between the three clusters of organisms (A. brasilense, E. coli, and R. leguminosarum) were 70.2, 80.8, and 71.1%, respectively. The numbers in the figure refer to the following: 1, Roseomonas fauriae (AF533354); 2, Roseomonas genomospecies (AY150050); 3, Azospirillum sp. (AB049110); 4, Devosia riboflavina (AF501346); 5, Agrobacterium tumefaciens (AF508094); 6, Agrobacterium tumefaciens (AF406666); 7, Sinorhizobium kummerowiae (AF364067); 8, Sinorhizobium meliloti (AF533685); 9, Rhizobium tropici (U89832); 10, Mesorhizobium plurifarium (AF516882); 11, Ochrobactrum sp. (AF452128); 12, Ochrobactrum anthropi (AF501340); 13, Idiomarina baltica (AJ440214); 14, Xenorhabdus nematophila (AF522294); 15, Pectobacterium carotovorum (AF373189); 16, Serratia quinivorans (AJ279050); 17, Serratia sp. (AF511524); 18, Serratia odorifera (AF286870); 19, Pectobacterium carotovorum (AF373184); 20, Erwinia amylovora (AF141892); 21, Escherichia albertii (AJ508775); 22, Escherichia coli (AY319394); 23, Obesumbacterium proteus (AY077753); 24, “Escherichia senegalensis” (AY217654); 25, “Dickeya dadantii” (AF520707); 26, Shewanella gelidimarina (AF530149); 27, Pseudoalteromonas sp. (AB055788); 28, Pseudoalteromonas prydzensis (U85855).

Bacterial composition of the Dünnwald soil sample as assessed by the TReFID program.

The TReFID program enabled us to retrieve 2,390 bacterial 16S rRNA gene sequences from the Dünnwald soil DNA related to the polygons deposited in the databank (Table 3). Among these, 930 sequences were dissimilar. The method provided 123 different genera with 227 species of known affiliation. Genera in cases in which the species was undefined (such as Azospirillum sp.) were regarded as unclassified. These and other unclassified bacteria represented about 60% of the total and thus the largest portion in this soil (Table 4). The next-largest part consisted of α-proteobacteria. However, other major groups of soil bacteria that included acidobacteria and actinobacteria also occurred. Others (about 10%) consisted of spirochetes, firmicutes, bacteroidetes, fusobacteria, fibrobacteres, planctomycetes, δ-proteobacteria, and cyanobacteria; these latter were Planktothricoides raciborskii, “Lyngbya hieronymusii,” “Planktothrix agardhii,” “P. rubescens,” “Trichodesmium havanum,” Anabaena compacta, Nostoc sp., and several noncultured forms, among which some have been described only for marine habitats as yet. For nifH, the number of sequences retrieved by this approach was not so large due to the limited number of sequences in the TReFID databank (Table 3). Surprisingly, the number of sequences related to unclassified bacteria possessing this gene amounted to 194 (87%) and was thus high. This was an unexpected, new result obtained by the use of the TReFID program. However, sequences related to members of the classical N2-fixing groups (Rhizobiales, Rhodobacterales, and Pseudomonadales) were also detected. The file obtained for bacteria with nosZ was small, but the highest percentage of these again belonged to the unclassified organisms. The actinobacteria and acidobacteria retrieved from the 16S rRNA gene sequences were not present in the list of bacteria with nifH or nosZ.
Only a few bacteria, Bradyrhizobium japonicum, Mesorhizobium ciceri, Methylosinus trichosporium, and Rhodovulum strictum, were found in the lists for both the 16S rRNA gene and nifH. Bacteria included in the list for the 16S rRNA gene and nosZ were Paracoccus denitrificans, P. pantotrophus, and Pseudomonas fluorescens. One bacterium, Pseudomonas stutzeri, possessed both nifH and nosZ but was not included in the list of the 16S rRNA gene.

TABLE 3.

Results for the groups of organisms retrieved from the Dünnwald soil by using the TReFID databank

Category                                                        16S rRNA    nifH    nosZ
Total no. of matches                                            2,390       224     23
    No. of different genera (classified in GenBank)             123         11      3
    No. of different species (classified in GenBank)            227         13      7
    No. of organisms not classified to genus level in GenBank   1,440       194     15
    % Unclassified                                              60          87      65
No. of common matches with the 16S rRNA gene:
    On the genus level                                                      6       3
    On the species level                                                    4       3

TABLE 4.

Taxa retrieved from the Dünnwald soil by using the TReFID databank

Taxon; no. of species or genera for indicated gene
Column order: 16S rRNA (species, genera); nifH (species, genera); nosZ (species, genera)
Alphaproteobacteria (including “sp.”) 431 74 7 5 5 2
    Rhizobiales 192 28 5 4 2 1
    Rhodobacterales 96 17 2 1
    Rhodospirillales 28 14 3 1
Betaproteobacteria 37 6 1 1
    Burkholderiales 32 2 1 1
Gammaproteobacteria 113 14 4 4 2 1
    Pseudomonadales 9 3 1 1 2 1
Actinobacteria 52 9
Cyanobacteria 31 8
Acidobacteria 3 1

DISCUSSION

To develop the TReFID program, all 30,000 16S rRNA gene sequences deposited after January 2001 and before January 2003 were examined for the presence of the utilized primer region and for a highly conserved motif in this region to allow primer binding. These criteria led to the selection of the 17,327 sequences out of the total of 73,245 entries available in January 2003, when the database was set up. Deposits before January 2001 were not used because of uncertainties in the sequence information and because the 17,327 were regarded as being sufficient to characterize the diversity in bacterial communities. Theoretically, any bacterium could be identified if its 16S rRNA gene tRF profile is present in the databank. However, as pointed out by several investigators (reviewed in reference 7), the observed and the predicted tRF lengths do not always match in environmental samples. Since these discrepancies between the predicted and observed lengths are sequence specific (7) and also experimentally depend on the lengths determined by the scanners employed, a window had to be created to account for these deviations in the lengths of tRFs. Thus, the technique allows retrieval only of those sequences that are closely related to sequences in the databanks but does not allow unambiguous identification of species.

The tRF analysis also does not permit the determination of the abundance of single sequences in an environmental sample. This is because DNA templates differ in their primer homologies and because bacteria can contain different copy numbers of the 16S rRNA gene (4). In addition, PCR can hardly be exactly reproducible with different DNA preparations from the same environmental habitat unless extensive and careful calibrations are performed. Despite these drawbacks, the 17,327 16S rRNA gene polygons in the TReFID database allow characterization of the population structure in an environmental sample at the DNA sequence level. The relative percentages of sequences retrieved which are related to a specific bacterium, as exemplified for E. coli, Azospirillum brasilense, and Rhizobium leguminosarum (Fig. 5), depend on the number of entries in the databank but do not reflect the situation in an environmental sample. As in other investigations, a sequence completely identical to one deposited in GenBank is rarely retrieved from a soil or other environmental sample. This might reflect the existence of a sequence continuum among the 10⁴ ribotypes per g of soil (19), as was also inferred from other investigations (16).

The present report seems to be the first attempt to characterize N2-fixing and denitrifying bacterial profiles in an environmental sample by a tRF analysis. The nifH gene coding for nitrogenase reductase is highly conserved among N2-fixing bacteria (22), and more sequences are available for this gene than for the other two structural (nifDK) genes or for any other nif gene. The 822 polygons deposited in the TReFID databank are sufficient to allow characterization of the composition of a population of N2-fixing bacteria in environmental samples, as shown for the Dünnwald soil, but the resolution will improve as more nifH sequences become available. Clearly, the situation is presently in its infancy for nosZ or any other gene coding for an enzyme of denitrification. Thus, the present communication merely indicates that the technique can theoretically also be employed for denitrification.

The TReFID program presented here and the phylogenetic assignment tool recently published (6) should complement each other in characterizations of members of a bacterial community by their 16S rRNA gene tRF profiles. The two methods differ in several respects. The approach by Kent et al. (6) employs a hierarchical algorithm which allows the identification of organisms by analysis of the tRFs provided by the restriction enzymes in a consecutive order (see Fig. 1 of reference 6). Their tRF data have been automatically generated by MiCA (http://mica.ibest.uidaho.edu). The TReFID program utilizes polygons of tRFs of defined bacterial sequences obtained from databanks or a clone library. The TReFID program is presently more comprehensive for analysis of 16S rRNA gene sequences. It contains many sequences from bacteria not yet cultured. The program by Kent et al. (6) utilizes the 8F primer, whereas TReFID employs the 63F region, which is strongly conserved among the 16S rRNA gene sequences deposited in the databanks (13). In addition, many sequences of GenBank contain the 63F but not the 8F region. Only the use of the 63F region as a primer binding site enabled us to develop the more than 17,000 polygons used here for the 16S rRNA gene. The program described in reference 6 could not easily install such a detailed phylogram as that shown in Fig. 5 (C. Rösch, unpublished).

Previous molecular analyses of soil or marine samples suggested a relatively low species richness with respect to nifH but not the nosZ or the 16S rRNA gene (15, 16, 17, 21). The present study with DNA from the Dünnwald soil indicated by the use of the TReFID program that most nifH sequences retrieved were related to those of noncultured bacteria whose sequences have been deposited in GenBank. It is presently not clear whether this result represents a special feature of the Dünnwald soil or whether the tRFLP analysis more readily accesses noncultured bacteria than other molecular approaches tried so far. To resolve this issue, a more detailed analysis of the bacterial population and its dynamics after N fertilization is presently under way with soil samples from Dünnwald and also from other locations. This is now possible, since the tRFLP TReFID program allows rapid analysis of the bacterial composition of an environmental sample and avoids time-consuming and expensive cloning and sequencing.

Acknowledgments

We are indebted to M. G. Yates and G. B. Lewes, formerly at the Unit of Nitrogen Fixation of the ARC in Brighton, United Kingdom, for helpful comments and for improving the English of the manuscript. The excellent technical assistance of Emmanuelle Mounier, Stefanie Backhausen, Mirela Stecki, and Karin Otto is gratefully acknowledged.

This study was kindly supported by grants from the GEW Stiftung (Cologne, Germany) and the Deutsche Bundesstiftung Umwelt (Osnabrück, Germany).

REFERENCES

  • 1. Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
  • 2. Amann, R. I., W. Ludwig, and K. H. Schleifer. 1995. Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol. Rev. 59:143-169.
  • 3. Chatzinotas, A., R. A. Sandaa, W. Schonhuber, R. Amann, F. L. Daae, V. Torsvik, J. Zeyer, and D. Hahn. 1998. Analysis of broad-scale differences in microbial community composition of two pristine forest soils. Syst. Appl. Microbiol. 21:579-587.
  • 4. Fogel, G. B., C. R. Collins, J. Li, and C. F. Brunk. 1999. Prokaryotic genome size and SSU rDNA copy number: estimation of microbial relative abundance from a mixed population. Microb. Ecol. 38:93-113.
  • 5. Hall, T. A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41:95-98.
  • 6. Kent, A. D., D. J. Smith, B. J. Benson, and E. W. Triplett. 2003. Web-based phylogenetic assignment tool for analysis of terminal restriction fragment length polymorphism profiles of microbial communities. Appl. Environ. Microbiol. 69:6768-6776.
  • 7. Kitts, C. L. 2001. Terminal restriction fragment patterns: a tool for comparing microbial communities and assessing community dynamics. Curr. Issues Intest. Microbiol. 2:17-25.
  • 8. Larsen, N. 2004. Chimeric check tool; part of the ribosomal Database Project II, 2.7 ed. http://rdp.cme.msu.edu/cgis/phylip.cgi.
  • 9. Liu, W.-T., T. L. Marsh, H. Cheng, and L. J. Forney. 1997. Characterization of microbial diversity by determining terminal restriction fragment length polymorphisms of genes encoding 16S rRNA. Appl. Environ. Microbiol. 63:4516-4522.
  • 10. Marchesi, J. R., T. Sato, A. J. Weightman, T. A. Martin, J. C. Fry, S. J. Hiom, D. Dymock, and W. G. Wade. 1998. Design and evaluation of useful bacterium-specific PCR primers that amplify genes coding for bacterial 16S rRNA. Appl. Environ. Microbiol. 64:795-799.
  • 11. Mayr, C., A. Winding, and N. B. Hendriksen. 1999. Community level physiological profile of soil bacteria unaffected by extraction method. J. Microbiol. Methods 36:29-33.
  • 12. Muyzer, G., E. C. de Waal, and A. G. Uitterlinden. 1993. Profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Appl. Environ. Microbiol. 59:695-700.
  • 13. Osborn, A. M., E. R. B. Moore, and K. N. Timmis. 2000. An evaluation of terminal-restriction fragment length polymorphisms (T-RFLP) analysis for the study of microbial community structure and dynamics. Environ. Microbiol. 2:39-50.
  • 14. Page, R. D. 1996. TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12:357-358.
  • 15. Rich, J. J., R. S. Heichen, P. J. Bottomley, K. J. Cromack, and D. D. Myrold. 2003. Community composition and functioning of denitrifying bacteria from adjacent meadow and forest soils. Appl. Environ. Microbiol. 69:5974-5982.
  • 16. Rösch, C., A. Mergel, and H. Bothe. 2002. Biodiversity of denitrifying and dinitrogen-fixing bacteria in an acid forest soil. Appl. Environ. Microbiol. 68:3818-3829.
  • 17. Scala, D. J., and L. J. Kerkhof. 1998. Nitrous oxide reductase (nosZ) gene-specific PCR primers for detection of denitrifiers and three nosZ genes from marine sediments. FEMS Microbiol. Lett. 162:61-68.
  • 18. Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The Clustal_X Windows Interface; flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.
  • 19. Torsvik, V., J. Goksoyr, and F. L. Daae. 1990. High diversity in DNA of soil bacteria. Appl. Environ. Microbiol. 56:782-787.
  • 20. Weisburg, W. G., S. M. Barns, D. A. Pelletier, and D. J. Lane. 1991. 16S ribosomal DNA amplification for phylogenetic study. J. Bacteriol. 173:697-703.
  • 21. Widmer, F., B. T. Shaffer, L. A. Porteous, and R. J. Seidler. 1999. Analysis of nifH gene pool complexity in soil and litter at a Douglas fir forest site in the Oregon Cascade Mountain Range. Appl. Environ. Microbiol. 65:374-380.
  • 22. Zehr, J. P., B. D. Jenkins, S. M. Short, and G. F. Steward. 2003. Nitrogenase gene diversity and microbial community structure: a cross-system comparison. Environ. Microbiol. 5:539-554.

Articles from Applied and Environmental Microbiology are provided here courtesy of American Society for Microbiology (ASM)
