J Bacteriol. 1999 Apr;181(8):2358–2362. doi: 10.1128/jb.181.8.2358-2362.1999

Genetic Diversity in the Protective Antigen Gene of Bacillus anthracis

Lance B Price 1, Martin Hugh-Jones 2, Paul J Jackson 3, Paul Keim 1,*
PMCID: PMC93658  PMID: 10197996

Abstract

Bacillus anthracis is a gram-positive spore-forming bacterium that causes the disease anthrax. The anthrax toxin contains three components, including the protective antigen (PA), which binds to eucaryotic cell surface receptors and mediates the transport of toxins into the cell. In this study, the entire 2,294-nucleotide protective antigen gene (pag) was sequenced from 26 of the most diverse B. anthracis strains to identify potential variation in the toxin and to further our understanding of B. anthracis evolution. Five point mutations, three synonymous and two missense, were identified. These differences correspond to six different haploid types, which translate into three different amino acid sequences. The two amino acid changes were shown to be located in an area near a highly antigenic region critical to lethal factor binding. Nested primers were used to amplify and sequence this same region of pag from necropsy samples taken from victims of the 1979 Sverdlovsk incident. This investigation uncovered five different alleles among the strains present in the tissues, including two not seen in the 26-sample survey. One of these two alleles included a novel missense mutation, again located just adjacent to the highly antigenic region. Phylogenetic (cladistic) analysis of the pag corresponded with previous strain grouping based on chromosomal variation, suggesting that plasmid evolution in B. anthracis has occurred with little or no horizontal transfer between the different strains.


Bacillus anthracis is the causative organism of the potentially fatal disease anthrax. Virulent forms of B. anthracis carry two large plasmids, pX01 (ca. 174 kb) and pX02 (ca. 95 kb). Virulence factors include toxin and capsule production, encoded on pX01 and pX02, respectively. The anthrax toxin is composed of three proteinaceous subunits: (i) lethal factor (LF), the toxin component thought to kill host cells by disrupting the mitogen-activated protein kinase pathway (2); (ii) edema factor (EF), an adenylyl cyclase that causes skin edema in the infected host (6); and (iii) protective antigen (PA), which binds to eucaryotic cell surface proteins, forms homoheptamers, and then binds to and internalizes EF and LF.

The structure and function of PA have been well described. The entire PA gene (pag) sequence has been published and is available in GenBank (accession no. M22589) (12). The three-dimensional structure has also been solved and is available in the NCBI Entrez 3D database (MMDB no. 6980) (10). Finally, antibody-binding experiments have been used to define regions of the PA protein critical to cell surface attachment as well as LF binding (8). Missing from the literature until now was a population study of pag from diverse strains of B. anthracis to define the natural variation in this important gene.

In past studies, plasmid-specific genetic variation in B. anthracis has been largely ignored. A recent population study, based on chromosomal markers, demonstrated that B. anthracis is one of the most monomorphic bacterial species known (5). This chromosomal amplified fragment length polymorphism study examined ca. 6.3% of the B. anthracis genome for length variations and ca. 0.36% for point mutations. However, due to ambiguities arising from the absence of one or both of the plasmids, plasmidal data were omitted from the final results. Studies of pX01 diversity and especially of pag are essential to understanding evolution of pathogenesis in B. anthracis. Likewise, comparative studies of plasmid-based versus chromosomal variation can provide insight into the frequency of horizontal plasmid transfer in natural B. anthracis populations.

In this study we sequenced the entire pag gene from 26 of the most diverse strains of B. anthracis (5). These sequences were aligned, analyzed for point mutations, and then studied phylogenetically to determine whether the pag data are consistent with the chromosomal diversity groups. Additionally, we sequenced a 307-bp variable region of pag from 10 Sverdlovsk anthrax victim necropsy samples (4) in order to identify novel pag sequences.

MATERIALS AND METHODS

B. anthracis DNA.

Culture conditions, DNA isolation methods, and diversity groups are described in reference 5. Necropsy tissue DNA was isolated as described by Jackson et al. (4).

PCR amplification of DNA.

Table 1 contains the sequences for all primers used for this project. These were designed from the published pag sequence (GenBank accession no. M22589) and synthesized by Gibco/BRL, Bethesda, Md. All primer positions cited throughout this report are based on this GenBank sequence. Two DNA fragments, together totaling 2,531 bp of sequence, were initially amplified to provide a pag sequencing template from the 26 B. anthracis strains. PA-1F and PA-1R were used to amplify a 1,191-bp fragment containing the 5′ portion of PA. This included 131 bp of upstream flanking sequence. PA-2F and PA-2R were used to amplify a 1,449-bp fragment containing the 3′ portion of PA. This included 106 bp of downstream flanking sequence. The two fragments contained 109 bp of overlapping sequence near the middle of the gene. Fifty-microliter PCR mixtures contained 1× PCR buffer (20 mM Tris [pH 8.4], 50 mM KCl; Gibco/BRL), 0.10 mM deoxynucleoside triphosphates, 4 mM MgCl2, ∼0.2 ng of template DNA per μl, 0.04 U of Taq DNA polymerase (Gibco/BRL) per μl, and 0.4 μM forward and reverse primers, adjusted to 50 μl with filtered (0.2-μm-pore-size filter) 17.8-megohm E-pure water. Reactions were heated to 94°C for 5 min and then subjected to 35 cycles, each consisting of 30 s at 94°C, 30 s at 62°C, and 1.5 min at 72°C. This was followed by heating to 72°C for 5 min to complete primer extension. PCR products were purified through Qiaquick purification minicolumns (Qiagen Inc., Valencia, Calif.) and then quantified on ethidium bromide-stained 1.25% agarose-Tris-acetate-EDTA gels. These purified fragments were then used in subsequent sequencing reactions. PCR amplification of necropsy sample DNA was performed as described by Jackson et al. (4), using primers PA-5F, PA-5R, PA-5Fnest, and PA-5Rnest (Table 1).
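As a quick check on the reaction setup, the per-reaction amounts and the programmed cycling time implied by the stated concentrations can be worked out directly. The sketch below is illustrative arithmetic only: the variable names are hypothetical, and the time total ignores thermocycler ramping, which is not reported here.

```python
# Illustrative arithmetic for the 50-µl PCR described above.
# Variable names are hypothetical; all numeric values come from the text.
REACTION_VOLUME_UL = 50

template_ng = 0.2 * REACTION_VOLUME_UL   # ~0.2 ng of template DNA per µl
taq_units = 0.04 * REACTION_VOLUME_UL    # 0.04 U of Taq polymerase per µl
primer_pmol = 0.4 * REACTION_VOLUME_UL   # 0.4 µM equals 0.4 pmol per µl

# Programmed hold times in seconds (ramp times are not reported, so ignored).
initial_denaturation_s = 5 * 60
cycle_s = 30 + 30 + 90                   # 94°C, 62°C, and 72°C steps
final_extension_s = 5 * 60
total_s = initial_denaturation_s + 35 * cycle_s + final_extension_s

print(f"template: {template_ng:.0f} ng, Taq: {taq_units:.0f} U, "
      f"each primer: {primer_pmol:.0f} pmol")
print(f"programmed time: {total_s / 60:.1f} min")
```

Under these figures each reaction holds about 10 ng of template, 2 U of polymerase, and 20 pmol of each primer, and the programmed holds add up to 97.5 min.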

TABLE 1.

Primers used in this study

| Primer | Type^a | Sequence (5′→3′) | Position^b | Amplicon size (bp) |
|---|---|---|---|---|
| PA-1F | Amp/Seq | ATA TTT ATA AAA GTT CTG TTT AAA AAG CC | 5′-1673 | 1,191 |
| PA-1R | Amp/Seq | TAA ATC CTG CAG ATA CAC TCC CAC | 3′-2840 | 1,191 |
| PA-2F | Amp/Seq | ATA AGT AAA AAT ACT TCT ACA AGT AGG ACA C | 5′-2755 | 1,449 |
| PA-2R | Amp/Seq | GAT TTA GAT TAC TGT TTA AAA CAT ACT CTC C | 3′-4173 | 1,449 |
| PA-3 | Seq | TCA TGT AAC AAT GTG GGT AGA TGA C | 5′-2145 | NA^c |
| PA-4 | Seq | CTC TAT GAG CCT CCT TAA CTA CTG AC | 3′-3717 | NA |
| PA-5F | Amp | ATC CTA GTG ATC CAT TAG AAA CGA C | 5′-3416 | 330 |
| PA-5R | Amp | CTT CTC TAT GAG CCT CCT TAA CTA CTG | 3′-3719 | 330 |
| PA-5Fnest | Amp/Seq | AGT GAT CCA TTA GAA ACG AC | 5′-3421 | 307 |
| PA-5Rnest | Amp/Seq | TAA CTA CTG ACT CAT CCG C | 3′-3709 | 307 |

^a Amp, used for amplification; Seq, used for sequencing; Amp/Seq, used for both amplification and sequencing.

^b All correspond to GenBank accession no. M22589 nucleotide positions.

^c NA, not applicable.
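The amplicon sizes, the 109-bp overlap, and the 2,531-bp total can be cross-checked against the primer coordinates in Table 1. The sketch below assumes one particular reading of the position column, which is our assumption rather than something stated in the table: "5′-N" is taken as the coordinate of a forward primer's 5′ end, and "3′-N" as the coordinate at which a reverse primer's 3′ end anneals. The function names are hypothetical.

```python
# Sketch: amplicon sizes from the primer coordinates in Table 1.
# Assumption (ours): "5'-N" marks a forward primer's 5' end and "3'-N" marks
# the coordinate where a reverse primer's 3' end anneals (M22589 numbering).
PRIMERS = {
    "PA-1F": ("ATA TTT ATA AAA GTT CTG TTT AAA AAG CC", 1673),
    "PA-1R": ("TAA ATC CTG CAG ATA CAC TCC CAC", 2840),
    "PA-2F": ("ATA AGT AAA AAT ACT TCT ACA AGT AGG ACA C", 2755),
    "PA-2R": ("GAT TTA GAT TAC TGT TTA AAA CAT ACT CTC C", 4173),
    "PA-5F": ("ATC CTA GTG ATC CAT TAG AAA CGA C", 3416),
    "PA-5R": ("CTT CTC TAT GAG CCT CCT TAA CTA CTG", 3719),
    "PA-5Fnest": ("AGT GAT CCA TTA GAA ACG AC", 3421),
    "PA-5Rnest": ("TAA CTA CTG ACT CAT CCG C", 3709),
}


def amplicon_span(forward, reverse):
    """Return (start, end) of the amplicon in GenBank coordinates."""
    _, start = PRIMERS[forward]
    rev_seq, rev_3prime = PRIMERS[reverse]
    rev_len = len(rev_seq.replace(" ", ""))
    # The reverse primer extends rev_len bases upward from its 3' end.
    return start, rev_3prime + rev_len - 1


def size(span):
    """Length in bp of an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


frag1 = amplicon_span("PA-1F", "PA-1R")
frag2 = amplicon_span("PA-2F", "PA-2R")
overlap = frag1[1] - frag2[0] + 1      # bases shared by the two fragments
total = frag2[1] - frag1[0] + 1        # combined sequenced span
gene_length = total - 131 - 106        # remove the stated flanking sequence

for fwd, rev in [("PA-1F", "PA-1R"), ("PA-2F", "PA-2R"),
                 ("PA-5F", "PA-5R"), ("PA-5Fnest", "PA-5Rnest")]:
    print(fwd, rev, size(amplicon_span(fwd, rev)))
print("overlap:", overlap, "total:", total, "gene:", gene_length)
```

Read this way, the coordinates reproduce every figure quoted in the text: 1,191, 1,449, 330, and 307 bp for the four primer pairs, a 109-bp overlap, a 2,531-bp combined span, and a 2,294-bp gene once the 131-bp and 106-bp flanks are removed.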

DNA sequencing.

PCR products were sequenced on an ABI model 377 fluorescence sequencer using a PRISM Ready Reaction BigDye terminator cycle sequencing kit (both from Perkin-Elmer/Applied Biosystems Inc., Foster City, Calif.). Sequences were aligned and analyzed with Sequence Navigator software (Perkin-Elmer/Applied Biosystems).

Cladistic analysis.

Cladistic analysis was performed on the pag sequences by using maximum parsimony with PAUP 3.1.1 software (developed by David L. Swofford, Illinois Natural History Survey) and manual examinations of sequence polymorphisms.

Three-dimensional analysis.

The PA structure has been solved and is available on the NCBI Entrez 3D database (MMDB no. 6980) (10). Amino acid residues shown to vary among strains were identified on the three-dimensional structure, and then physical distances from the putative LF binding region of PA domains 3 and 4 were estimated by using MAGE 4.5 software (developed by David Richardson, Biochemistry Department, Duke University, Durham, N.C.).

RESULTS

Sequence alignment of the entire PA gene from 26 strains representative of the five B. anthracis diversity groups (5) (Table 2) revealed five point mutations, three synonymous and two missense, shown in Table 3. All five mutations are transitions. Two of the synonymous mutations occur only once. However, the other differences are present with frequencies ranging from 3/26 to 20/26. The two missense mutations are located adjacent to a highly antigenic region crossing the junction between PA domains 3 and 4 shown to be critical to LF binding (Fig. 1) (8, 10). The different mutational combinations observed in this study give rise to six PA genotypes and three PA phenotypes (Table 4).

TABLE 2.

B. anthracis strains used in this study

| Strain | Geographic origin or description | Diversity group^a | PA genotype^b | PA phenotype^c |
|---|---|---|---|---|
| BA0052 | Jamaica | Sterne-Ames | I | FPA |
| BA1087 | Scotland | Sterne-Ames | I | FPA |
| J611 | Indonesia | Sterne-Ames | I | FPA |
| BA1031 | South Africa | Sterne-Ames | I | FPA |
| BA1043 | South Africa | Sterne-Ames | I | FPA |
| 28 | Ohio | Sterne-Ames | II | FPA |
| MOZ-3 | Mozambique | Southern Africa | III | FPA |
| BA1035 | South Africa | Southern Africa | III | FPA |
| 33 | South Africa | Southern Africa | IV | FPA |
| A24 | Slovakia | Southern Africa | V | FPV |
| K20 | South Africa (Kruger) | Kruger | V | FPV |
| 26/05/94 | Zambia | Kruger | V | FPV |
| BA1033 | South Africa | WNA | V | FPV |
| BA1017 | Haiti | WNA | V | FPV |
| BA1015 | Maryland | WNA | V | FPV |
| 93-194C | Canada | WNA | V | FPV |
| 93-195C-8 | Canada | WNA | V | FPV |
| BA1040 | Colorado | WNA | V | FPV |
| BA1007 | Iowa | WNA | V | FPV |
| 2/6 | Turkey | WNA | V | FPV |
| Pak-2 | Pakistan | WNA | V | FPV |
| STI-1 | Russian vaccine strain | WNA | V | FPV |
| F-1 | South Korea | Vollum | V | FPV |
| BA1024 | Ireland | Vollum | VI | FSV |
| ASC-3 | United Kingdom | Vollum | VI | FSV |
| BA1009 | Pakistan | Vollum | VI | FSV |

^a Diversity designations are consistent with those described by Keim et al. (5).

^b Described in Table 4.

^c Designated by the single-letter designations of the three amino acids shown to vary in this study.

TABLE 3.

Mutations identified in this study

| Mutation | Nucleotide position^a | Base change | Frequency | Amino acid change |
|---|---|---|---|---|
| 1 | 1998 | C⇔T | 20/26 | Synonymous |
| 2 | 2883 | G⇔A | 1/26 | Synonymous |
| 3 | 3481 | T⇔C | NA^b | F⇔L |
| 4 | 3496 | C⇔T | 3/26 | P⇔S |
| 5 | 3602 | C⇔T | 17/26 | A⇔V |
| 6 | 3606 | T⇔C | 1/26 | Synonymous |
| 7 | 3672 | A⇔G | NA | Synonymous |

^a Nucleotide positions are based on the 4,235-bp pX01 sequence from Sterne strain, accession no. M22589, containing pag in its entirety.

^b NA, not applicable (mutation was observed only in the Sverdlovsk samples).
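The statement that the observed changes are transitions can be checked mechanically from the base changes in Table 3. The following is a minimal sketch; the dictionary and function names are hypothetical, and the base pairs are copied from the table.

```python
# Sketch: classify each base change in Table 3 as a transition or transversion.
# A transition swaps purine for purine (A, G) or pyrimidine for pyrimidine (C, T).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Mutation number -> the two bases observed at that position (Table 3).
BASE_CHANGES = {
    1: ("C", "T"),
    2: ("G", "A"),
    3: ("T", "C"),
    4: ("C", "T"),
    5: ("C", "T"),
    6: ("T", "C"),
    7: ("A", "G"),
}


def is_transition(base_a, base_b):
    """True if both bases differ and belong to the same chemical class."""
    pair = {base_a, base_b}
    return base_a != base_b and (pair <= PURINES or pair <= PYRIMIDINES)


for number, (a, b) in BASE_CHANGES.items():
    kind = "transition" if is_transition(a, b) else "transversion"
    print(f"mutation {number}: {a}<->{b} is a {kind}")
```

All seven changes listed in Table 3, including the two seen only in the Sverdlovsk samples, fall into the transition class under this definition.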

FIG. 1.

Model of pag from B. anthracis. S, region of gene that codes for cleaved signal region; NP-F and NP-R, forward and reverse nested primers used to amplify variable regions from the Sverdlovsk tissue samples; black vertical arrows, missense mutations; grey vertical arrows, synonymous mutations; HAR, highly antigenic region important to LF binding (8, 10). Dom., domain.

TABLE 4.

PA genotypes and phenotypes identified in this study^a

| PA genotype | PA phenotype^b | Genotypic frequency | Mutation 1^c | Mutation 2 | Mutation 3 | Mutation 4 | Mutation 5 | Mutation 6 | Mutation 7 |
|---|---|---|---|---|---|---|---|---|---|
| I | FPA | 5/26 | C | G | T(F) | C(P) | C(A) | T | A |
| II | FPA | 1/26 | C | G | T(F) | C(P) | C(A) | C | A |
| III | FPA | 2/26 | T | G | T(F) | C(P) | C(A) | T | A |
| IV | FPA | 1/26 | T | A | T(F) | C(P) | C(A) | T | A |
| V | FPV | 14/26 | T | G | T(F) | C(P) | T(V) | T | A |
| VI | FSV | 3/26 | T | G | T(F) | T(S) | T(V) | T | A |
| VIISvd | LPA | NA | — | — | C(L) | C(P) | C(A) | T | A |
| VIIISvd | FPA | NA | — | — | T(F) | C(P) | C(A) | T | G |

^a NA, not applicable (mutation was seen only in the Sverdlovsk samples); —, the region was not analyzed for the Sverdlovsk sample.

^b Designated by the single-letter designation of the three amino acids shown to vary.

^c Described in Table 3.
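The mutation frequencies in Table 3 follow from the genotype frequencies and base states in Table 4. The sketch below recomputes them for the 26-strain survey; it treats genotype I as the comparison state, which is our choice for illustration and not a claim about which state is ancestral. All names are hypothetical.

```python
# Sketch: recompute Table 3 frequencies from the Table 4 genotypes.
# Each genotype maps to (bases at mutations 1-7, number of strains, phenotype).
SURVEY_GENOTYPES = {
    "I":   ("CGTCCTA", 5, "FPA"),
    "II":  ("CGTCCCA", 1, "FPA"),
    "III": ("TGTCCTA", 2, "FPA"),
    "IV":  ("TATCCTA", 1, "FPA"),
    "V":   ("TGTCTTA", 14, "FPV"),
    "VI":  ("TGTTTTA", 3, "FSV"),
}

# Assumption for illustration only: genotype I is used as the comparison state.
reference = SURVEY_GENOTYPES["I"][0]
n_strains = sum(count for _, count, _ in SURVEY_GENOTYPES.values())

# Number of strains differing from the comparison state at each mutation site.
variant_counts = {}
for site in range(7):
    variant_counts[site + 1] = sum(
        count
        for bases, count, _ in SURVEY_GENOTYPES.values()
        if bases[site] != reference[site]
    )

# Number of strains carrying each PA phenotype.
phenotype_counts = {}
for _, count, phenotype in SURVEY_GENOTYPES.values():
    phenotype_counts[phenotype] = phenotype_counts.get(phenotype, 0) + count

for mutation, count in variant_counts.items():
    print(f"mutation {mutation}: {count}/{n_strains}")
print(phenotype_counts)
```

The counts agree with Table 3 (20, 1, 3, 17, and 1 of 26 for mutations 1, 2, 4, 5, and 6), and mutations 3 and 7 are absent from the survey, as expected for changes seen only in the Sverdlovsk samples.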

Cladistic analysis of the 26 pag sequences was performed by the maximum parsimony method to produce a gene tree (Fig. 2A). The 26 strains grouped into four clades of 3, 3, 6, and 14 individuals. These groups were defined by three synapomorphic (informative) differences. In addition, we identified two apomorphic (uninformative) nucleotide differences (mutations 2 and 6) that separated two strains (28 and 33) from others in their clades. These mutations are identified on the respective branches but were not used to isolate these strains from their groups. The clades and topology identified by this tree were mostly congruent with those generated from chromosomal markers (Fig. 2B) (5). The only aberrations are the following: (i) chromosomal data from strain A24 indicate that it is of the Southern Africa lineage (5), but the pag data place this strain with the Western North America (WNA) diversity group (one mutational step away); (ii) chromosomal data from strain F-1 indicate that it is of the Vollum lineage, but the pag data place this strain with the WNA diversity group (again, one mutational step away); and (iii) chromosomal markers indicate that the Kruger samples, although very similar, are genetically distinct from the WNA lineage, but the pag gene tree did not resolve these two distinct groups. It should be noted that chromosomal markers indicate that Vollum and WNA are sister groups and, likewise, that Kruger and WNA are closely related. Only with strain A24 do the pag data suggest that strains from two distantly related groups (based on chromosomal markers) are closely related.
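The distinction between informative and uninformative differences can be illustrated with the genotype data from Table 4. The sketch below is not the PAUP analysis itself; it applies the usual criterion that a site is parsimony informative when at least two base states each occur in at least two strains, and then groups strains by their states at those sites. All names are hypothetical.

```python
# Sketch: find parsimony-informative sites and the strain groups they define.
# This is a simplified illustration, not a reimplementation of PAUP.
from collections import Counter

# Genotype -> (bases at mutations 1-7 from Table 4, number of strains).
SURVEY_GENOTYPES = {
    "I": ("CGTCCTA", 5),
    "II": ("CGTCCCA", 1),
    "III": ("TGTCCTA", 2),
    "IV": ("TATCCTA", 1),
    "V": ("TGTCTTA", 14),
    "VI": ("TGTTTTA", 3),
}


def state_counts(site):
    """Count strains carrying each base at a zero-based site index."""
    counts = Counter()
    for bases, n in SURVEY_GENOTYPES.values():
        counts[bases[site]] += n
    return counts


def is_informative(site):
    """At least two states, each present in at least two strains."""
    return sum(1 for n in state_counts(site).values() if n >= 2) >= 2


informative = [s + 1 for s in range(7) if is_informative(s)]
singletons = [s + 1 for s in range(7)
              if len(state_counts(s)) > 1 and not is_informative(s)]

# Group strains by their bases at the informative sites only.
groups = Counter()
for bases, n in SURVEY_GENOTYPES.values():
    key = "".join(bases[m - 1] for m in informative)
    groups[key] += n

print("informative mutations:", informative)
print("uninformative mutations:", singletons)
print("group sizes:", sorted(groups.values()))
```

With the Table 4 data, mutations 1, 4, and 5 meet the criterion and partition the strains into groups of 3, 3, 6, and 14, while mutations 2 and 6 each occur in a single strain.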

FIG. 2.

Cladistic analysis of the 26 diverse strains. (A) Unrooted, maximum parsimony gene tree based on pag data developed in this study; (B) strain diversity groups based on chromosomal AFLP data described by Keim et al. (5). Branch mutations are numbered as described for Table 3.

To determine the pag genotypes and phenotypes of the strain(s) involved in the Sverdlovsk incident, nested PCR primers (Table 1) were designed to amplify and sequence a 307-bp region of pag. This region spans the junction between PA domains 3 and 4 where much of the variation was observed. This analysis uncovered two additional transition mutations (3 and 7 in Table 3). One was synonymous, while the other was a novel missense mutation resulting in a phenylalanine↔leucine change. These changes resulted in two additional genotypes and one new phenotype (Table 4). The amino acid change was, again, immediately adjacent to the highly antigenic region of PA domains 3 and 4 (Fig. 1). Repetitive sequencing of these tissues uncovered multiple PA genotypes within some of the individual necropsy samples. Together, five different PA genotypes were observed in the Sverdlovsk samples, with some samples showing evidence of infection by multiple strains (Table 5). This finding is consistent with the results of Jackson et al. (4).
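Which of the Table 3 mutations can be scored in the necropsy samples depends on whether their positions fall between the nested primers. The sketch below uses the nested primer coordinates from Table 1 (3421 and 3709) as a conservative window; treating those two coordinates as the window bounds is our simplification, and the names are hypothetical.

```python
# Sketch: which Table 3 mutations lie inside the nested-PCR window?
# Window bounds are the nested primer coordinates listed in Table 1; using
# them directly as bounds is a conservative simplification on our part.
WINDOW_START = 3421   # PA-5Fnest position
WINDOW_END = 3709     # PA-5Rnest position

# Mutation number -> nucleotide position (Table 3, M22589 numbering).
MUTATION_POSITIONS = {
    1: 1998,
    2: 2883,
    3: 3481,
    4: 3496,
    5: 3602,
    6: 3606,
    7: 3672,
}

inside = [m for m, pos in MUTATION_POSITIONS.items()
          if WINDOW_START <= pos <= WINDOW_END]
outside = [m for m in MUTATION_POSITIONS if m not in inside]

print("scorable in necropsy samples:", inside)
print("outside the sequenced region:", outside)
```

Mutations 3 through 7 fall inside the window, whereas mutations 1 and 2 lie upstream of it, so the Sverdlovsk genotypes can only be assigned from the last five sites.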

TABLE 5.

Tissue samples from Sverdlovsk victims analyzed in this study^a

| Sample | Tissue | PA genotype(s) | PA phenotype(s) |
|---|---|---|---|
| 7.RA93.15.15 | Spleen | V | FPV |
| 40.RA93.40.5 | Spleen | VI, VIISvd | FSV, LPA |
| 27.RA93.30.3 | Spleen | V | FPV |
| 37.RA93.35.4 | Vaccination site | I^b, V | FPA, FPV |
| 37.RA93.35.6 | Lung | VIIISvd | FPA |
| 3.RA93.1.1 | Meninges | VIIISvd | FPA |
| 25.RA93.031 | Meninges | V | FPV |
| 1.RA93.42.1 | Meninges | V | FPV |
| 33.RA93.20.5 | Meninges | V | FPV |
| 21.RA93.38.4 | Lymph node | V | FPV |

^a Determination of PA genotypes and phenotypes was based solely on the 307-bp region connecting PA domains 3 and 4. Multiple strains were identified in some tissues.

^b Due to the limited region analyzed, this strain may be type I, II, or IV but was grouped with I for simplicity.

Figure 3 is an unrooted phylogenetic tree demonstrating the five mutational steps leading to the six PA genotypes and three PA phenotypes identified in this study. Additionally, the putative positions of the Sverdlovsk samples are shown. However, because the Sverdlovsk identifications were based on just the 307-bp region around the antigenic portion of PA domains 3 and 4, these placements are only tentative.

FIG. 3.

Unrooted phylogenetic tree of PA genotypes. Open boxes show the three PA phenotypes identified; shaded boxes show the possible positions of the Sverdlovsk genotypes, VIISvd and VIIISvd. Synonymous mutations are shown in open circles, and missense mutations are shown in closed circles. Each mutation is described in Table 3 and the phenotypes are described in Table 4.

Three-dimensional analysis of all the amino acid changes observed in this study (mutations 3, 4, and 5 in Table 3) indicated that these changes are not only close sequentially but also very close in three-dimensional space to the antigenic region important for LF binding. Mutation 3 (Phe to Leu) is ca. 11.2 Å, mutation 4 (Pro to Ser) is ca. 20.3 Å, and mutation 5 (Ala to Val) is ca. 19.0 Å from the central portion of this region. These spatial distances were estimated solely from peptide backbone-to-peptide backbone relationships. However, when the three-dimensional spaces occupied by the side chains of the amino acids were considered, changes were found to affect residues as close as 6.9 Å from the central amino acids of this critical antigenic region.

DISCUSSION

The protective antigen protein is central to the virulence associated with anthrax toxin. Elucidation of variation in PA and its encoding gene could lead to a better understanding of B. anthracis virulence and evolution. Until now, pag had been sequenced in its entirety only from a single B. anthracis strain (12). In this study, a detailed analysis of the entire pag sequence (2,294 bp) from 26 diverse B. anthracis strains revealed only five point mutations, corroborating the high degree of genetic monomorphism found by Keim et al. (5).

Among these mutations, the number of missense changes (two) is disproportionately high relative to the number of synonymous changes (three). A common ratio of missense to synonymous mutations is approximately 1:5; here we see a ratio more than threefold greater (7). These missense mutations are located near a highly antigenic region critical to LF binding. In monoclonal antibody studies, Little et al. demonstrated that by blocking an epitope between amino acids Ile-581 and Asn-601 (Fig. 1), they could effectively block LF binding to PA (8). Three-dimensional analysis indicated that the missense mutations identified in our study are very close in three-dimensional space to this antigenic region. While none of the three missense mutations were dramatic, such as a change from an extremely hydrophobic to a hydrophilic amino acid, the proline-to-serine change has the potential to make important three-dimensional alterations, since proline isomerization is known to play a critical role in protein folding. Because of their close proximity, these amino acid changes have the potential to affect LF binding, either directly or indirectly, within an infected host. The grouping of these missense mutations near this antigenic region and the disproportionate number of missense to synonymous mutations suggest adaptive variation. One of the two new mutations identified in the Sverdlovsk victims’ tissues was found to be a novel missense mutation located, sequentially and three dimensionally, near the highly antigenic region of the junction between PA domains 3 and 4. When these mutations are included with those identified in the 26-sample survey, the ratio of missense to synonymous mutations is increased to 3.8:5.
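The ratios quoted in this paragraph can be reproduced with exact fractions. The sketch below is simple arithmetic on the mutation counts given in the text; the variable names are hypothetical, and the 1:5 baseline is the figure cited from reference 7.

```python
# Sketch: missense-to-synonymous ratios, expressed relative to a 1:5 baseline.
from fractions import Fraction

baseline = Fraction(1, 5)            # common ratio cited in the text (7)

survey_ratio = Fraction(2, 3)        # 26-strain survey: 2 missense, 3 synonymous
combined_ratio = Fraction(3, 4)      # with Sverdlovsk: 3 missense, 4 synonymous

fold_over_baseline = survey_ratio / baseline
combined_per_five = combined_ratio * 5   # missense changes per 5 synonymous

print(f"survey ratio is {float(fold_over_baseline):.2f}-fold the baseline")
print(f"combined ratio is {float(combined_per_five):.2f}:5")
```

The survey ratio works out to 10/3 of the baseline, or a little over threefold, and the combined ratio is 3.75:5, which the text rounds to 3.8:5.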

The amplification and sequencing of the 307-bp pag fragment from the Sverdlovsk tissue samples suggested that at least five different strains of B. anthracis were present in the samples and that some of the individual victims had been infected with multiple strains. These data corroborate earlier work with the vrrA locus that suggested that multiple strains of anthrax had been released during the 1979 incident (1, 4). Besides the Russian vaccine strain STI-1, included in this study, these tissue samples are a rare glimpse at the different strains of B. anthracis that are thought to be endemic in the vast region of the former Soviet Union. The fact that two previously unobserved mutations were found in the Sverdlovsk samples stresses the importance of collecting and analyzing B. anthracis strains from areas where anthrax is endemic but largely uncharacterized by molecular genetic analysis.

Independent cladistic analysis of pX01 by using the pag sequence has enabled us to estimate the likelihood of horizontal transfer of this plasmid between different B. anthracis strains in natural populations. Although horizontal transfer in Bacillus spp. is possible under laboratory conditions, the similarity of the cladistic grouping from the pag data to that of the chromosomal markers suggests that the differences in pag arose from evolution within particular strain lineages and were not a result of horizontal pX01 transfer. The single possible exception is associated with the A24 sample, which chromosomally is related to the Southern Africa strains, while the pag data for this strain are consistent with Kruger-WNA. This is either a result of convergent evolution or evidence of horizontal pX01 transfer. Further, it should be noted that the data presented in this report do not rule out the potential for horizontal transfer of plasmid pX01 between closely related strains within an infected host.

The unrooted phylogenetic tree (Fig. 3) is a useful tool for demonstrating the relationships between the different PA genotypes. However, it is not meant to imply evolution toward a particular form of PA. Although distant homologues from other gram-positive bacteria are cited (3, 9, 11), none of these is close enough to root a B. anthracis PA phylogenetic tree. Without an ancestral PA sequence, one is unable to determine which PA phenotypes are ancestral and which are derived.

ACKNOWLEDGMENTS

We thank James M. Schupp, Kimothy L. Smith, Debra M. Adair, and Karen K. Hill for technical support.

REFERENCES

  1. Andersen G L, Simchock J M, Wilson K H. Identification of a region of genetic variability among Bacillus anthracis strains and related species. J Bacteriol. 1996;178:377–384. doi: 10.1128/jb.178.2.377-384.1996.
  2. Duesbery N S, Webb C P, Leppla S H, Gordon V M, Klimpel K R, Copeland T D, Ahn N G, Oskarsson M K, Fukasawa K, Paull K D, Vande Woude G F. Proteolytic inactivation of MAP-kinase-kinase by anthrax lethal factor. Science. 1998;280:734–737. doi: 10.1126/science.280.5364.734.
  3. Gibert M, Perelle S, Daube G, Popoff M R. Clostridium spiroforme toxin genes are related to C. perfringens iota toxin genes but have a different genomic localization. Syst Appl Microbiol. 1997;20:337–347.
  4. Jackson P J, Hugh-Jones M E, Adair D M, Green G, Hill K K, Kuske C R, Grinberg L M, Abramova F A, Keim P. PCR analysis of tissue samples from the 1979 Sverdlovsk anthrax victims: the presence of multiple Bacillus anthracis strains in different victims. Proc Natl Acad Sci USA. 1998;95:1224–1229. doi: 10.1073/pnas.95.3.1224.
  5. Keim P, Kalif A, Schupp J, Hill K, Travis S E, Richmond K, Adair D M, Hugh-Jones M, Kuske C R, Jackson P. Molecular evolution and diversity in Bacillus anthracis as detected by amplified fragment length polymorphism markers. J Bacteriol. 1997;179:818–824. doi: 10.1128/jb.179.3.818-824.1997.
  6. Leppla S H. Anthrax toxin edema factor: a bacterial adenylate cyclase that increases cyclic AMP concentrations of eukaryotic cells. Proc Natl Acad Sci USA. 1982;79:3162–3166. doi: 10.1073/pnas.79.10.3162.
  7. Li W, Graur D. Fundamentals of molecular evolution. Sunderland, Mass: Sinauer Associates, Inc.; 1991.
  8. Little S F, Novak J M, Lowe J R, Leppla S H, Singh Y, Klimpel K R, Ligerding B C, Friedlander A M. Characterization of lethal factor binding and cell receptor binding domains of protective antigen of Bacillus anthracis using monoclonal antibodies. Microbiology. 1996;142:707–715. doi: 10.1099/13500872-142-3-707.
  9. Perelle S, Gibert M, Boquet P, Popoff M R. Characterization of Clostridium perfringens iota-toxin genes and expression in Escherichia coli. Infect Immun. 1993;61:5147–5156. doi: 10.1128/iai.61.12.5147-5156.1993.
  10. Petosa C, Collier R J, Klimpel K R, Leppla S H, Liddington R C. Crystal structure of the anthrax toxin protective antigen. Nature. 1997;385:833–838. doi: 10.1038/385833a0.
  11. Selvapandiyan A. Direct submission. GenBank accession no. Y17158. 1998.
  12. Welkos S L, Lowe J R, Eden-McCutchan F, Vodkin M, Leppla S H, Schmidt J J. Sequence and analysis of the DNA encoding protective antigen of Bacillus anthracis. Gene. 1988;69:287–300. doi: 10.1016/0378-1119(88)90439-8.

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)
