International Journal of Evolutionary Biology. 2012 Mar 26;2012:342482. doi: 10.1155/2012/342482

Comparative Analyses of Base Compositions, DNA Sizes, and Dinucleotide Frequency Profiles in Archaeal and Bacterial Chromosomes and Plasmids

Hiromi Nishida 1,*
PMCID: PMC3321278  PMID: 22536540

Abstract

In the present paper, I compared guanine-cytosine (GC) contents, DNA sizes, and dinucleotide frequency profiles in 109 archaeal chromosomes, 59 archaeal plasmids, 1379 bacterial chromosomes, and 854 bacterial plasmids. In more than 80% of archaeal and bacterial plasmids, the GC content was lower than that of the host chromosome. Furthermore, most of the differences in GC content found between a plasmid and its host chromosome were less than 10%, and the GC content in plasmids and host chromosomes was highly correlated (Pearson's correlation coefficient r = 0.965 in bacteria and 0.917 in archaea). These results support the hypothesis that horizontal gene transfers have occurred frequently via plasmid distribution during evolution. GC content and chromosome size were more highly correlated in bacteria (r = 0.460) than in archaea (r = 0.195). Interestingly, there was a tendency for archaea with plasmids to have higher GC content in the chromosome and plasmid than those without plasmids. Thus, the dinucleotide frequency profile of the archaeal plasmids has a bias toward high GC content.

1. Introduction

DNA base composition, specifically guanine-cytosine (GC) content, is a bacterial taxonomic marker. For example, actinobacteria have high-GC genomes, whereas clostridia have low-GC genomes [1]. In addition, assessing the dinucleotide frequency profile, a genome signature, of a genomic DNA sequence is a powerful tool to compare different chromosomes and plasmids [2–6]. In bacterial chromosomes, GC content and DNA size are correlated [7–10]. In bacterial phages, plasmids, and insertion sequences, the GC contents are lower than those of their host chromosomes [11].

Replication of and transcription from plasmid DNA are controlled mainly by factors encoded by the chromosome of the host organism. Therefore, it is hypothesized that the GC content and genome signature of a plasmid are similar to those of the chromosome of the host organism. In addition, it is believed that horizontal gene transfers have occurred frequently via plasmid distribution during evolution [12]. For example, a cell-cell communication system may be distributed among the genus Streptomyces using horizontal gene transfer via plasmids [13].

Prokaryotes consist of 2 evolutionarily distinct groups: archaea and bacteria [14]. Comparative genomics in bacteria is very advanced, whereas whole-genome sequence data for archaea have been limited. However, owing to recent developments in DNA sequencing technology, more than 100 archaeal genome sequences have now been determined. In this study, I compared GC contents, DNA sizes, and dinucleotide frequency profiles in archaeal and bacterial chromosomes and plasmids.

2. Materials and Methods

In this study, 109 archaeal chromosomes, 59 archaeal plasmids, 1379 bacterial chromosomes, and 854 bacterial plasmids were obtained from OligoWeb, a database for searching oligonucleotide frequencies (http://insilico.ehu.es/oligoweb/). Chromosomes and plasmids were distinguished according to the annotation in OligoWeb. Pearson's correlation coefficients, statistical tests, and plots were produced using the software R (http://www.r-project.org/).
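The two sequence-derived quantities compared in this study can be illustrated with a minimal sketch. The analysis itself used values from OligoWeb and the software R; the Python functions below (`gc_content` and `dinucleotide_profile`, both hypothetical names) only show what the quantities mean. The sketch assumes a single strand, overlapping two-base windows, and that symbols other than A, C, G, and T are skipped; OligoWeb may count differently.

```python
from collections import Counter
from itertools import product
from typing import Dict


def gc_content(seq: str) -> float:
    """Return the GC content (%) of a DNA sequence, ignoring non-ACGT symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (counts["G"] + counts["C"]) / total


def dinucleotide_profile(seq: str) -> Dict[str, float]:
    """Return the relative frequency of each of the 16 dinucleotides.

    Assumption: overlapping windows on the given strand only; windows that
    contain an ambiguous symbol (for example N) are skipped.
    """
    seq = seq.upper()
    keys = ["".join(pair) for pair in product("ACGT", repeat=2)]
    valid = set(keys)
    counts = Counter(
        seq[i:i + 2] for i in range(len(seq) - 1) if seq[i:i + 2] in valid
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid dinucleotide")
    return {key: counts[key] / total for key in keys}


if __name__ == "__main__":
    toy = "ATGCGCGTTA"  # toy sequence for illustration, not real data
    print(f"GC content: {gc_content(toy):.1f}%")
    for dinucleotide, freq in dinucleotide_profile(toy).items():
        print(dinucleotide, round(freq, 3))
```

The 16 frequencies sum to 1, so profiles from DNA molecules of different sizes can be compared directly.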

3. Results

The 59 archaeal plasmids and 854 bacterial plasmids are distributed among 26 and 393 organisms, respectively. Some of the archaea and bacteria have 2 or 3 chromosomes. Therefore, in total, the 26 archaeal host organisms and 393 bacterial host organisms have 28 and 441 chromosomes, respectively. The GC contents of bacterial plasmids were found to be lower than those of the host chromosomes (Figure 1, Supplementary Table S1), which is consistent with a previous study [11]. The GC contents of archaeal plasmids were also lower than those of the host chromosomes (Figure 2, Supplementary Table S2). In 777 (81.5%) of the 953 bacterial chromosome-plasmid pairs and 57 (85.1%) of the 67 archaeal chromosome-plasmid pairs, the plasmid GC content was lower than that of its host chromosome (Figure 3). In addition, 746 (78.3%) of the 953 bacterial pairs and 47 (70.1%) of the 67 archaeal pairs showed less than 10% difference between the GC content of the plasmid and that of its host chromosome (Figure 3).
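The pair counts above can be reproduced with a short sketch, given here in Python rather than the R used for the analysis. It assumes that "less than 10% difference" means an absolute difference of fewer than 10 percentage points of GC content. The function names and the example pairs are hypothetical; only the counts passed to `percent` come from the text.

```python
from typing import Iterable, Tuple


def summarize_pairs(
    pairs: Iterable[Tuple[float, float]], threshold: float = 10.0
) -> Tuple[int, int, int]:
    """Return (total pairs, pairs with lower plasmid GC, pairs within threshold).

    Each pair is (plasmid_gc, chromosome_gc), both in percent.
    """
    total = lower = close = 0
    for plasmid_gc, chromosome_gc in pairs:
        total += 1
        if plasmid_gc < chromosome_gc:
            lower += 1
        if abs(plasmid_gc - chromosome_gc) < threshold:
            close += 1
    return total, lower, close


def percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)


if __name__ == "__main__":
    # Hypothetical pairs for illustration only (not from the supplementary tables).
    example_pairs = [(48.0, 50.5), (61.2, 66.0), (40.0, 62.0), (52.3, 51.0)]
    print(summarize_pairs(example_pairs))

    # Counts reported in the text, converted to percentages.
    print("bacteria, plasmid lower:", percent(777, 953))
    print("archaea, plasmid lower:", percent(57, 67))
    print("bacteria, within 10 points:", percent(746, 953))
    print("archaea, within 10 points:", percent(47, 67))
```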

Figure 1.

Boxplot of GC contents in bacterial plasmids and host chromosomes. Circles indicate the GC content (%) of each plasmid or chromosome, and lines link each plasmid to its host chromosome. The data set is shown in Supplementary Table S1, available online at doi:10.1155/2012/342482.

Figure 2.

Boxplot of GC contents of archaeal plasmids and host chromosomes. Circles indicate the GC content (%) of each plasmid or chromosome, and lines link each plasmid to its host chromosome. The data set is shown in Supplementary Table S2.

Figure 3.

Histogram showing the difference between GC contents of plasmids and host chromosomes. Frequency means the number of pairs of chromosome and plasmid.

The GC contents in plasmids and the host chromosomes were highly correlated in both bacteria and archaea (Pearson's correlation coefficient r = 0.965 and r = 0.917, respectively; Figures 4 and 5, respectively). Furthermore, GC content and chromosome size were more highly correlated in bacteria than in archaea (Figures 6 and 7, Supplementary Tables S3 and S4). Pearson's correlation coefficients between GC content and chromosome size of archaea and bacteria were 0.195 and 0.460, respectively. In archaea, organisms with high-GC chromosomes tend to have plasmids (Figures 2 and 7). Thus, the dinucleotide frequency profile of the archaeal plasmids has a bias toward high GC content (Figure 8).
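For readers unfamiliar with the statistic, Pearson's correlation coefficient can be sketched as follows. This is a plain Python illustration of the standard formula, not the R code used for the analysis; `pearson_r` is a hypothetical name, and the example values are invented for illustration and are not taken from the supplementary tables.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Return Pearson's correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical GC contents (%) of plasmids and their host chromosomes.
    plasmid_gc = [35.0, 42.5, 50.1, 58.3, 66.0]
    chromosome_gc = [38.2, 45.0, 52.7, 61.9, 69.4]
    print(round(pearson_r(plasmid_gc, chromosome_gc), 3))
```

A value near 1 means that plasmid and chromosome GC contents rise together, as reported for both bacteria and archaea; a value near 0, as for archaeal GC content versus chromosome size, indicates little linear relationship.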

Figure 4.

Scatter plot of GC contents of bacterial plasmids and host chromosomes. Pearson's correlation coefficient is 0.965. The data set is shown in Supplementary Table S1.

Figure 5.

Scatter plot of GC contents of archaeal plasmids and host chromosomes. Pearson's correlation coefficient is 0.917. The data set is shown in Supplementary Table S2.

Figure 6.

Scatter plot of GC contents and chromosome sizes in bacteria. Red and blue circles indicate chromosomes with and without plasmids, respectively. Red and blue lines indicate the regression lines. The data set is shown in Supplementary Table S3.

Figure 7.

Scatter plot of GC contents and chromosome sizes in archaea. Red and blue circles indicate chromosomes with and without plasmids, respectively. Red and blue lines indicate the regression lines. The data set is shown in Supplementary Table S4.

Figure 8.

Boxplots of dinucleotide frequency profiles in chromosomes and plasmids of archaea and bacteria. The profiles are based on 109 archaeal chromosomes, 59 archaeal plasmids, 1379 bacterial chromosomes, and 854 bacterial plasmids.

4. Discussion

I hypothesize that the GC content of a plasmid, a genomic signature, is related to host specificity and host range. Here, I showed that the GC content of a plasmid is generally lower than that of its host chromosome (Figures 1 and 2). However, in most cases, the difference in GC content between a plasmid and its host chromosome was less than 10% (Figure 3), strongly suggesting that host organisms cannot maintain and regulate plasmids with very different base compositions.

On the other hand, some organisms showed a large difference in GC content between their chromosomes and plasmids. For example, in bacteria, Frankia symbiont of Datisca glomerata has the greatest difference (GC content of the chromosome is 70%; that of the plasmid pFSYMDG02 is 43.1%), and Desulfovibrio magneticus RS-1 has the second greatest difference (GC content of the chromosome is 62.8%; that of the plasmid pDMC2 is 37.2%) (Supplementary Table S1). The regulation system for these plasmids is of particular interest.

In this analysis, there was a tendency for plasmid-containing archaea to have higher GC content in the host chromosome and plasmid than those without plasmids (Figures 2, 5, and 7). It remains unclear why archaea with mid- and low-GC chromosomes tend to lack plasmids. This GC content bias was not found in bacteria (Figures 1, 4, and 6). Thus, although the dinucleotide frequency profiles of the bacterial chromosomes and plasmids were similar, those of the archaeal chromosomes and plasmids were different (Figure 8).

GC content and chromosome size in bacteria are weakly correlated (r = 0.460), which is consistent with previous reports [7–10]. However, the GC content and chromosome size in archaea are less correlated (r = 0.195). Considering these results, the relationship between GC content and chromosome size may differ in archaea and bacteria. In order to understand the high GC content bias of archaeal plasmids and elucidate the relationship between GC content and chromosome size in archaea, more archaeal genome sequence data are needed.

Supplementary Material

Supplementary Table S1: Pairs of chromosome and plasmid in bacteria.

Supplementary Table S2: Pairs of chromosome and plasmid in archaea.

Supplementary Table S3: Bacterial chromosomes compared in this analysis.

Supplementary Table S4: Archaeal chromosomes compared in this analysis.

Acknowledgment

The author thanks Professor Teruhiko Beppu for his valuable comments.

References

  • 1. Sueoka N. Variation and heterogeneity of base composition of deoxyribonucleic acids: a compilation of old and new data. Journal of Molecular Biology. 1961;3(1):31–40.
  • 2. Campbell A, Mrázek J, Karlin S. Genome signature comparisons among prokaryote, plasmid, and mitochondrial DNA. Proceedings of the National Academy of Sciences of the United States of America. 1999;96(16):9184–9189. doi: 10.1073/pnas.96.16.9184.
  • 3. van Passel MWJ, Bart A, Luyf ACM, van Kampen AHC, van der Ende A. Compositional discordance between prokaryotic plasmids and host chromosomes. BMC Genomics. 2006;7, article 26. doi: 10.1186/1471-2164-7-26.
  • 4. Mrázek J. Phylogenetic signals in DNA composition: limitations and prospects. Molecular Biology and Evolution. 2009;26(5):1163–1169. doi: 10.1093/molbev/msp032.
  • 5. Suzuki H, Yano H, Brown CJ, Top EM. Predicting plasmid promiscuity based on genomic signature. Journal of Bacteriology. 2010;192(22):6045–6055. doi: 10.1128/JB.00277-10.
  • 6. Suzuki H, Sota M, Brown CJ, Top EM. Using Mahalanobis distance to compare genomic signatures between bacterial plasmids and chromosomes. Nucleic Acids Research. 2008;36(22), article e147. doi: 10.1093/nar/gkn753.
  • 7. Musto H, Naya H, Zavala A, Romero H, Alvarez-Valín F, Bernardi G. Genomic GC level, optimal growth temperature, and genome size in prokaryotes. Biochemical and Biophysical Research Communications. 2006;347(1):1–3. doi: 10.1016/j.bbrc.2006.06.054.
  • 8. Mitchell D. GC content and genome length in Chargaff compliant genomes. Biochemical and Biophysical Research Communications. 2007;353(1):207–210. doi: 10.1016/j.bbrc.2006.12.008.
  • 9. Guo FB, Lin H, Huang J. A plot of G + C content against sequence length of 640 bacterial chromosomes shows the points are widely scattered in the upper triangular area. Chromosome Research. 2009;17(3):359–364. doi: 10.1007/s10577-009-9024-3.
  • 10. Bentley SD, Parkhill J. Comparative genomic structure of prokaryotes. Annual Review of Genetics. 2004;38:771–792. doi: 10.1146/annurev.genet.38.072902.094318.
  • 11. Rocha EPC, Danchin A. Base composition bias might result from competition for metabolic resources. Trends in Genetics. 2002;18(6):291–294. doi: 10.1016/S0168-9525(02)02690-2.
  • 12. Davison J. Genetic exchange between bacteria in the environment. Plasmid. 1999;42(2):73–91. doi: 10.1006/plas.1999.1421.
  • 13. Nishida H, Ohnishi Y, Beppu T, Horinouchi S. Evolution of γ-butyrolactone synthases and receptors in Streptomyces. Environmental Microbiology. 2007;9(8):1986–1994. doi: 10.1111/j.1462-2920.2007.01314.x.
  • 14. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proceedings of the National Academy of Sciences of the United States of America. 1990;87(12):4576–4579. doi: 10.1073/pnas.87.12.4576.


Articles from International Journal of Evolutionary Biology are provided here courtesy of Wiley
