KEYWORDS: Staphylococcus aureus, bacteriophage lysis, bacteriophage therapy, bacteriophages, bioinformatics, computational biology, efficiency of plating, evolution, GWAS, phage host range, phage resistance, spot assay
ABSTRACT
Staphylococcus aureus is a human pathogen that causes serious diseases, ranging from skin infections to septic shock. Bacteriophages (phages) are both natural killers of S. aureus, offering therapeutic possibilities, and important vectors of horizontal gene transfer (HGT) in the species. Here, we used high-throughput approaches to understand the genetic basis of strain-to-strain variation in sensitivity to phages, which defines the host range. We screened 259 diverse S. aureus strains covering more than 40 sequence types for sensitivity to eight phages, which were representatives of the three phage classes that infect the species. The phages were variable in host range, each infecting between 73 and 257 strains. Using genome-wide association approaches, we identified putative loci that affect host range and validated their function using USA300 transposon knockouts. In addition to rediscovering known host range determinants, we found several previously unreported genes affecting bacterial growth during phage infection, including trpA, phoR, isdB, sodM, fmtC, and relA. We used the data from our host range matrix to develop predictive models that achieved between 40% and 95% accuracy. This work illustrates the complexity of the genetic basis for phage susceptibility in S. aureus but also shows that with more data, we may be able to understand much of the variation. With a knowledge of host range determination, we can rationally design phage therapy cocktails that target the broadest host range of S. aureus strains and address basic questions regarding phage-host interactions, such as the impact of phage on S. aureus evolution.
IMPORTANCE Staphylococcus aureus is a widespread, hospital- and community-acquired pathogen, many strains of which are antibiotic resistant. It causes diverse diseases, ranging from local to systemic infection, and affects both the skin and many internal organs, including the heart, lungs, bones, and brain. Its ubiquity, antibiotic resistance, and disease burden make new therapies urgent. One alternative therapy to antibiotics is phage therapy, in which viruses specific to infecting bacteria clear infection. In this work, we identified and validated S. aureus genes that influence phage host range (the number of strains a phage can infect and kill) by testing strains representative of the diversity of the S. aureus species for phage host range and associating the genome sequences of strains with host range. Together, these findings improve our understanding of how phage therapy works in the bacterium and improve prediction of phage therapy efficacy based on the predicted host range of the infecting strain.
INTRODUCTION
There is no licensed vaccine for Staphylococcus aureus, and many clinical strains are resistant to multiple antibiotics. For these reasons, alternative treatments such as bacteriophage therapy are being actively investigated (1, 2). Phage therapy has some advantages over using antibiotics. Phages show little or no human toxicity, and the high diversity of natural phages available to be isolated for treatment suggests that complete resistance would be hard to evolve (3, 4). However, there is no natural phage known to kill all S. aureus strains, and for that reason, phage cocktails (mixtures of phages with nonoverlapping host ranges) are necessary. Rational cocktail formulation requires comprehensive knowledge of the genetic factors that influence phage host range.
S. aureus phages and corresponding known host mechanisms regulating phage resistance and host range have previously been reviewed (1, 5, 6). Known S. aureus phages belong to the order Caudovirales (tailed phages) and are further divided into three morphological classes: the long, noncontractile-tailed Siphoviridae, the long, contractile-tailed Myoviridae, and the short, noncontractile-tailed Podoviridae (5). The Siphoviridae are temperate, while the Myoviridae and Podoviridae are virulent (5). The Siphoviridae bind either α-O-GlcNAc or β-O-GlcNAc attached at the C-4 position of wall teichoic acid (WTA) ribitol phosphate monomers, while the Podoviridae bind only β-O-GlcNAc-decorated WTA, and the Myoviridae bind either the WTA ribitol-phosphate backbone or β-O-GlcNAc-decorated WTA (1, 7, 8). S. aureus is known to produce polyribitol phosphate rather than polyglycerol phosphate WTA (9). WTA biosynthesis genes are conserved throughout the species, with the exception of the unusual sequence type ST395 (10), as are WTA glycosyltransferase genes tarM and tarS, but occasional tarM inactivation or absence provides Podoviridae susceptibility (11).
Currently identified resistance mechanisms in Staphylococcus species act at the adsorption, biosynthesis, and assembly stages of infection (1). Adsorption resistance mechanisms include receptor alteration, removal, or occlusion by large surface proteins or polysaccharides (capsule) (7, 11–16). Biosynthesis resistance mechanisms include halting the infection process through metabolic arrest (abortive infection) and adaptive (CRISPR) or innate (restriction-modification) immunity to phage infection through phage DNA degradation (17–21). Temperate phage and S. aureus pathogenicity islands (SaPIs) inserted in the genome may also offer barriers to Siphoviridae, through superinfection immunity and assembly interference, which occurs through SaPI parasitization of the packaging machinery of the infecting viruses (22–27).
While previous studies have identified numerous individual host resistance mechanisms in S. aureus, few have examined the importance of these mechanisms on a species-wide scale. In addition, although many S. aureus phages are reported to have wide host ranges (28–34), and even early studies suggested staphylococcal phage therapies to be highly effective (35), experiments conducted thus far have failed to explain the genetic bases of host range or resistance development in a species-wide manner. Only one previous study has associated genetic factors (gene families) with phage resistance by using a hypothesis-free method (36). This work used a two-step linear regression model to associate some 167 gene families, mostly of unknown function, with resistance assessed in 207 clinical methicillin-resistant S. aureus (MRSA) strains and 12 phage preparations. However, the study did not associate any other types of genetic changes with host range and examined only MRSA strains.
In this study, we associated multiple genetic factors—gene presence/absence, point mutations, and more complex polymorphisms—with S. aureus phage host range and resistance in a hypothesis-free, species-wide, and genome-wide manner. We used a novel high-throughput assay to determine resistance phenotypes of 259 strains challenged with eight S. aureus phages belonging to all three morphological categories (Siphoviridae, Myoviridae, and Podoviridae). We then used two bacterial genome-wide association study (GWAS) techniques to identify core genome single nucleotide polymorphisms (SNPs) and subsequences of length k (k-mers) significantly associated with each phenotype and used these significant features to develop predictive models for each phenotype. We also tested for associations between phenotypes and phylogeny, clonal complex (CC), and methicillin resistance (MRSA) and validated novel genes found to be associated with sensitivity or resistance in the GWAS through molecular genetics, thus complementing the hypothesis-free GWAS approach with hypothesis-driven experiments and demonstrating that GWAS-discovered determinants have causative effects on phage resistance.
RESULTS
Development of a novel high-throughput host range assay.
In order to evaluate host range for a large number of S. aureus strains in a quantitative manner, we developed a high-throughput host range assay (Fig. 1), described in Materials and Methods. This assay measures the extent to which phages cause retardation of growth compared to a control. Before using data from the high-throughput assay for further analysis, we calibrated it against the traditional spot assay (Fig. 1A), which measures whether phages cause lysis in a lawn of bacterial cells. We compared spot assay results (sensitive [S], semisensitive [SS], or resistant [R]) for 108 strains and five phages to the average final soft agar turbidity (optical density at 600 nm [OD600]) of the strains in the high-throughput assay (Fig. 1B). For all phages tested, turbidity was significantly higher (P < 0.05, Wilcoxon signed-rank test) for spot-resistant strains than for spot-sensitive strains. For all phages tested but p003p, the turbidity was significantly higher for spot-semisensitive strains than for spot-sensitive strains. However, for only phages p0006 and p003p were turbidities significantly higher for spot-resistant strains than for spot-semisensitive strains. Thus, for all phages but p003p, it was possible to tell spot-sensitive from spot-semisensitive strains by the high-throughput assay, but only for phages p0006 and p003p was it possible to tell spot-semisensitive from spot-resistant strains by the new assay. Overall, these results showed strong agreement between the lysis-based spot assay and the high-throughput growth-based assay for differentiating between sensitive and resistant/semisensitive phenotypes.
FIG 1.
Development of the high-throughput phage host range assay. (A) Examples of fully sensitive (NRS149) and fully resistant (NRS148) spot assay phenotypes for five test phages (p0045, p0006, p0017S, p002y, and p003p). (B) Calibration of the high-throughput assay against qualitative spot assay phenotypes (S, sensitive, complete clearing; SS, semisensitive, cloudy clearing; R, resistant, no clearing) determined with the spot assay for 108 NARSA strains and the five phages listed for panel A. Siphoviridae are listed in red, and Myoviridae are listed in blue. Data represent the distribution of average high-throughput assay measurements for strains evaluated as S, SS, or R in corresponding spot assays. Wilcoxon signed-rank test significance values for each possible comparison are listed at the top of the corresponding box plots (ns, not significant; *, P = 0.01 to 0.05; **, P = 0.001 to 0.01; ***, P = 0.0001 to 0.001; ****, P = 0 to 0.0001). (C) Example of high-throughput assay results from one 96-well plate containing overnight cultures of 96 NARSA strains coincubated with phage p0006. (D) Example of high-throughput assay phenotypes for a sensitive S. aureus strain, a resistant strain, bacteria without phage, and phage without bacteria.
Host range is associated with clonal complex but not methicillin resistance.
We evaluated the host range of eight phages belonging to the Siphoviridae, Myoviridae, and Podoviridae. Siphoviridae (p0045, p0017S, p002y, p003p, and p0040), Myoviridae (p0006 and pyo), and Podoviridae (p0017) were most closely related to others of the same class but were not related at all to those of other classes (see Table S1 at https://figshare.com/articles/dataset/Supplemental_Table_S1/13355909). Among the Siphoviridae, p003p was the most divergent (between 97.75 and 97.83% similar to the others). On the host side, host range was determined against the eight phages for a set of 259 S. aureus strains representing 47 already defined sequence types (STs) and 17 already defined clonal complexes (CCs) (253 strains with sequence data are included in Table S2 at https://figshare.com/articles/dataset/Supplemental_Table_S2/13355933). The most common STs were 5 (25.69%), 8 (13.04%), 30 (6.72%), 105 (4.35%), and 121 (3.16%), while the most common CCs were 5 (37.15%), 8 (23.32%), 30 (12.25%), 121 (5.14%), and 1 (4.74%). The most common strain isolation years were 2005 (31.92%), 2012 (14.08%), 2002 (12.68%), 2017 (7.51%), and 2018 (7.04%), while the most common isolation locations were the United States (61.26%), France (19.76%), the United Kingdom (11.46%), and Japan (1.19%). Strain isolation years ranged from 1935 to 2018.
Phages p0045 and p0040, i.e., the two temperate phages, and p0017, the sole tested podovirus, had the highest proportions of resistant strains (71.8, 38.2, and 35.9%, respectively) among those tested (Fig. 2A and Table 1). The average and median final turbidities among tested strains were likewise highest for these phages (average/median, 0.80/0.88, 0.61/0.60, and 0.56/0.54, for p0045, p0040, and p0017, respectively). On the other hand, phages p0017S, p002y, p003p, and pyo, all virulent Siphoviridae or Myoviridae, had the lowest proportions of resistant strains (0.8, 1.2, 1.2, and 1.5%, respectively) and average/median final turbidities (0.31/0.27, 0.27/0.22, 0.32/0.31, and 0.26/0.21, respectively). Phage p0006 had an intermediate proportion of resistant strains (15.4%) and average/median final turbidity (0.49/0.44). Strains were resistant to between zero and six phages (Fig. 2B), with a median of two. The strains NRS148, NRS209, and NRS255 were resistant to six phages, the most among any strains. Phage host ranges were most similar (concordant, defined by number of strains with identical phenotypes between two phages) between phages p0017S, p002y, p003p, and pyo but least similar between phage p0045 and the previous set of four phages (Fig. 2C).
FIG 2.
Host range distribution, concordance, and multiple phage resistance. (A) Number of strains that fall into host range categories for each phage. Sensitive (S) corresponds to an OD600 of 0.1 to 0.4, semisensitive (SS) corresponds to an OD600 of 0.4 to 0.7, and resistant (R) corresponds to an OD600 of 0.7 or higher. (B) Histogram of number of phages to which strains are resistant, by the previous definition. (C) Concordance matrix of the host ranges of the tested phages. Concordance is defined as the number of strains with identical phenotypes between two phages. Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
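The phenotype cutoffs and the concordance definition given in this caption can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in the study: the function names, strain labels, and OD600 values are hypothetical, and the handling of boundary values (each bin includes its lower bound, and readings below 0.1 count as sensitive) is an assumption.

```python
def classify(od600):
    """Bin a final OD600 reading into S, SS, or R using the Fig. 2 cutoffs.

    Assumption: each bin includes its lower bound, and readings below 0.1
    are treated as sensitive.
    """
    if od600 >= 0.7:
        return "R"
    if od600 >= 0.4:
        return "SS"
    return "S"


def concordance(phenotypes_a, phenotypes_b):
    """Count the strains that have identical phenotypes for two phages."""
    shared = phenotypes_a.keys() & phenotypes_b.keys()
    return sum(phenotypes_a[s] == phenotypes_b[s] for s in shared)


# Hypothetical average final OD600 values for three strains and two phages.
od = {
    "phage_1": {"strain_A": 0.22, "strain_B": 0.55, "strain_C": 0.91},
    "phage_2": {"strain_A": 0.31, "strain_B": 0.82, "strain_C": 0.88},
}
calls = {
    phage: {strain: classify(value) for strain, value in strains.items()}
    for phage, strains in od.items()
}
print(calls)
print(concordance(calls["phage_1"], calls["phage_2"]))
```

Applying `concordance` to every pair of phages would fill a matrix of the kind shown in panel C.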
TABLE 1.
Summary statistics of phage host range phenotypes^a
Phenotype (no. [%] of strains) or summary statistic | p0045 | p0006 | p0017 | p0017S | p002y | p003p | p0040 | pyo |
---|---|---|---|---|---|---|---|---|
Sensitive | 25 (9.7) | 131 (50.6) | 89 (34.4) | 221 (85.3) | 225 (86.9) | 229 (88.4) | 64 (24.7) | 220 (84.9) |
Semisensitive | 48 (18.5) | 88 (34.0) | 77 (29.7) | 36 (13.9) | 31 (12.0) | 27 (10.4) | 96 (37.1) | 35 (13.5) |
Resistant | 186 (71.8) | 40 (15.4) | 93 (35.9) | 2 (0.8) | 3 (1.2) | 3 (1.2) | 99 (38.2) | 4 (1.5) |
Mean | 0.80 | 0.49 | 0.56 | 0.31 | 0.27 | 0.32 | 0.61 | 0.26 |
SD | 0.24 | 0.20 | 0.27 | 0.12 | 0.12 | 0.12 | 0.23 | 0.14 |
Median | 0.88 | 0.44 | 0.54 | 0.27 | 0.22 | 0.31 | 0.60 | 0.21 |
^a For each phage, the number of strains falling into each phenotype category was counted. These phenotypes were determined for each phage using the high-throughput assay. The numbers of sensitive (OD600, 0.1 to 0.4), semisensitive (0.4 to 0.7), and resistant (0.7 and higher) strains and percentages are listed first, followed by the mean, standard deviation (SD), and median quantitative phenotypes for all tested strains. Statistics summarize at least six biological replicates for each phage.
We also examined whether there were significant associations between clonal complex (CC) or methicillin-resistant S. aureus (MRSA) genetic background and each phage host range phenotype (Fig. 3). We hypothesized that CC would correlate with host range, given that type I restriction-modification specificity is strongly associated with CC (37, 38), restricting the infection of a strain by phage propagated in a strain of a different CC. We hypothesized that MRSA genetic background may also affect host range, given that the phage receptor WTA is required for methicillin resistance (39) but MRSA strains can tolerate more defects in WTA biosynthesis than methicillin-susceptible S. aureus (MSSA) strains (40). However, MRSA/MSSA phenotypic differences were only significant for phage p0040 (P < 0.001, Wilcoxon signed-rank test) (Fig. 3B). There were significant differences in phage resistance between individual CCs for all phages (P < 0.05, Tukey’s honestly significant differences based on one-way analysis of variance [ANOVA]) (see Table S4 at https://figshare.com/articles/dataset/Supplemental_Table_S4/13355942) and significant overall differences among all CCs (one-way ANOVA) for all phages (P < 0.05). Overall, these results indicate that MRSA genetic background for the most part is not associated with the host range of these phages, while CC overall affects the host ranges of all tested phages.
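The one-way ANOVA comparison among clonal complexes can be illustrated with a short sketch that computes the F statistic from grouped measurements. This is an assumption-laden illustration, not the authors' code: the function name and the OD600 values are hypothetical, and the sketch stops at the F statistic because a P value requires the F distribution, which the Python standard library does not provide.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of measurement groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2
        for group, m in zip(groups, group_means)
    )
    # Variation of individual measurements around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for group, m in zip(groups, group_means) for v in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical final OD600 values for strains from three clonal complexes.
cc_groups = [
    [0.21, 0.25, 0.30, 0.28],  # hypothetical CC "A"
    [0.45, 0.52, 0.48, 0.60],  # hypothetical CC "B"
    [0.81, 0.77, 0.90, 0.85],  # hypothetical CC "C"
]
print(one_way_anova_f(cc_groups))
```

In practice, a library routine such as `scipy.stats.f_oneway` returns both the F statistic and its P value, and a post hoc test such as Tukey's honestly significant difference is then needed to compare individual pairs of groups.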
FIG 3.
Phage resistance is related to clonal complex (CC) but not MRSA genetic background. Data represent the distribution of average high-throughput assay measurements for strains belonging to each presented CC (all 259 strains) (A) or MRSA/MSSA (126 NARSA strains) (B) genetic background. One-way ANOVA significance values for overall differences among CCs presented and Wilcoxon signed-rank test significance values for MRSA/MSSA differences are indicated at the top of the corresponding box plots (ns, not significant; *, P = 0.01 to 0.05; **, P = 0.001 to 0.01; ***, P = 0.0001 to 0.001; ****, P = 0 to 0.0001). Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
Resistance to each phage is homoplasious, emerging independently in multiple CCs (Fig. 4). We estimated phylogenetic signal by calculating Moran’s I, Abouheif’s Cmean, Pagel’s λ, and Blomberg’s K (41) for each phage host range phenotype, which resulted in statistically significant values in every case (Table 2). Both Moran’s I and Abouheif’s Cmean values fell between 0.17 and 0.37. Pagel’s λ values all were nearly 1, while Blomberg’s K values approached 0. Pagel’s λ values around 1 and Moran’s I/Abouheif’s Cmean values around 0 support a Brownian motion model (the phylogeny structure alone best explains the trait distribution), but Blomberg’s K values around 0 suggest that trait variance at the tips is greater than that predicted by the phylogeny under a Brownian motion model. All calculated phylogenetic signal values were statistically significant (P < 0.05 for randomization tests based on 999 simulations). Taken together, these results suggest that the structure of the phylogeny might explain the host ranges of the tested phage as expected under a Brownian motion model (random distribution of phenotypes among strains directed by the phylogeny overall). This neutral phylogenetic signal agrees with the previous finding that CC is associated with host range (Fig. 3A; see Table S4). While there is a CC association with host range, strain-specific effects may be even stronger than CC-specific effects, resulting in weak net phylogenetic signals.
FIG 4.
Phage resistance across the S. aureus species. Average high-throughput phage host range assay phenotypes (of at least six replicates) and corresponding strain clonal complexes were placed on a maximum-likelihood, midpoint-rooted core genome phylogeny and visualized with the Interactive Tree of Life (iTOL) (107). Phenotypes are presented on a scale from blue (lowest OD600, most sensitive) to orange (highest OD600, most resistant). Phenotypes from inside to outside correspond to phages p0045, p0006, p0017, p0017S, p002y, p003p, p0040, and pyo. CCs are shaded inside and outside the circumference of the tree.
TABLE 2.
Measures of phylogenetic signal for each phage resistance phenotype^a
Phage | Moran’s I | Abouheif’s Cmean | Pagel’s λ | Blomberg’s K |
---|---|---|---|---|
p0045 | 0.23 | 0.23 | 1.00 | 0.005 |
p0006 | 0.17 | 0.17 | 1.00 | 0.008 |
p0017S | 0.32 | 0.32 | 1.00 | 0.007 |
p002y | 0.23 | 0.23 | 1.00 | 0.008 |
p003p | 0.30 | 0.30 | 1.00 | 0.012 |
p0040 | 0.28 | 0.28 | 1.00 | 0.014 |
pyo | 0.36 | 0.37 | 1.00 | 0.006 |
p0017 | 0.31 | 0.31 | 1.00 | 0.014 |
^a Values that are significant are shown in bold. Significance was determined for 999 random permutations of the data.
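To make the permutation-based significance test concrete, the following sketch computes Moran's I for a trait and estimates a one-sided P value from 999 random shuffles of the trait values. It is a minimal illustration under stated assumptions: the function names are hypothetical, and the toy binary weight matrix (1 for tips in the same clade, 0 otherwise) stands in for the phylogenetic proximity matrix that a real analysis would derive from the tree.

```python
import random


def morans_i(values, weights):
    """Moran's I for trait values given a symmetric weight matrix."""
    n = len(values)
    mean_x = sum(values) / n
    dev = [x - mean_x for x in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_sum = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    total = sum(d * d for d in dev)
    return (n / w_sum) * (cross / total)


def permutation_test(values, weights, n_perm=999, seed=0):
    """Return the observed I and a one-sided permutation P value."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            at_least_as_large += 1
    # Count the observed arrangement itself so that P is never zero.
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Toy example: four tips in two clades; tips within a clade have weight 1.
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
toy_od600 = [0.9, 0.8, 0.2, 0.3]  # hypothetical phenotypes
print(permutation_test(toy_od600, toy_weights))
```

Values of I near 1 indicate that closely related tips share similar phenotypes, whereas values near 0 indicate no relationship between relatedness and phenotype.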
GWAS reveals novel genetic determinants of host range.
We used the GWAS tools pyseer (42) and treeWAS (43) to identify genetic loci strongly associated with the phage host range phenotype (see Fig. S3A in the supplemental material; Table 3). We chose these tools because they represent two alternatives for population structure correction: identifying principal components of a distance matrix (pyseer) and testing against phenotypes simulated based on the phylogeny (treeWAS). pyseer identified clusters of orthologous genes (COGs), SNPs, and k-mers beyond the respective multiple-testing-corrected significance thresholds, with at least one type of significant element for every phage. Most phages lacked k-mer P value inflation, with the exceptions of p0017S, p002y, and p003p, based on associated Q-Q plots (scatter above the diagonal at P values of 1e−2 or more indicated P value inflation) (Fig. S2). The number of significant COGs detected ranged from 48 (p0017S) to 347 (pyo). Significant SNPs were detected for all phages but p0045 and p0017S and ranged from 1 (p0017) to 249 (pyo). Significant SNPs were identified in tarJ (pyo, 672A>G synonymous) and tagH (p002y, 848T>C missense and 873A>T missense; pyo, 848T>C missense, 873A>T missense, and 876C>T synonymous). TarJ is responsible for activating ribitol phosphate with CTP to form CDP-ribitol (44), while TagH is a component of the ABC transporter that exports WTA to the cell surface (9). A substantial number of the significant p0017 k-mers [1,382; −log(P value) = 12.259] mapped to the recently discovered host range factor tarP. TarP was shown to confer podovirus resistance by transferring N-acetylglucosamine to the C-3 position of ribitol phosphate (14). Significant k-mers also mapped to hsdS [32 for p002y, −log(P value) = 9.33; 6 for p003p, −log(P value) = 8.54], oatA [2 for p002y, −log(P value) = 7.75; 3 for p003p, −log(P value) = 8.45], and tagH [11 for p002y, −log(P value) = 9.47; 10 for p003p, −log(P value) = 8.81]. HsdS determines the sequence specificity of SauI restriction-modification (37), while OatA, or peptidoglycan O-acetyltransferase, is required for phage adsorption at least in S. aureus strain H (45). Prophage-associated genes [186 k-mers for phage tail fiber gene SRX477019_02350 for phage p0045, −log(P value) = 12.21; 37 k-mers for the same gene for p0040, −log(P value) = 8.69] were the most significantly associated with two of the tested Siphoviridae, i.e., phages p0045 and p0040. This result agrees with the known temperate phage resistance mechanism of superinfection immunity, in which prophages express a repressor gene that prevents transcription of lytic genes of superinfecting phages (46).
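As a rough illustration of the quantities discussed in this paragraph, the sketch below extracts k-mers from toy sequences, records their presence/absence pattern across strains, and converts a P value to the −log scale alongside a Bonferroni-style threshold. Everything here is a simplifying assumption: the function names and sequences are hypothetical, a single fixed k is used, and basing the threshold on the number of unique presence/absence patterns is one common choice for k-mer GWAS rather than a procedure stated in the text.

```python
import math


def kmers(sequence, k):
    """Return the set of all length-k subsequences of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def presence_patterns(genomes, k):
    """Map each k-mer to its 0/1 presence calls across sorted strain names."""
    names = sorted(genomes)
    per_strain = {name: kmers(genomes[name], k) for name in names}
    all_kmers = set().union(*per_strain.values())
    return {
        kmer: tuple(int(kmer in per_strain[name]) for name in names)
        for kmer in all_kmers
    }


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def neg_log10(p_value):
    """Express a P value on the -log10 scale used for reporting."""
    return -math.log10(p_value)


# Hypothetical toy "genomes"; real inputs would be assembled genomes.
toy_genomes = {"s1": "ACGTAC", "s2": "ACGTTT", "s3": "TTTTAC"}
patterns = presence_patterns(toy_genomes, k=3)
n_unique_patterns = len(set(patterns.values()))
threshold = bonferroni_threshold(0.05, n_unique_patterns)
print(patterns["ACG"], n_unique_patterns, threshold, neg_log10(threshold))
```

Each presence/absence pattern would then be tested for association with the host range phenotype, with population structure handled by the GWAS tool.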
TABLE 3.
GWAS summary statistics for each associated genetic element
No. of unique genetic elements^a, by type (method) | p0045 | p0006 | p0017 | p0017S | p002y | p003p | p0040 | pyo |
---|---|---|---|---|---|---|---|---|
COGs (pyseer) | 131 | 49 | 76 | 48 | 163 | 175 | 163 | 347 |
SNPs (pyseer) | 0 | 27 | 1 | 0 | 134 | 48 | 6 | 249 |
k-mers (pyseer) | 820 | 18 | 7,078 | 101 | 1,734 | 866 | 180 | 14 |
SNPs (treeWAS) | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 4 |
^a Each value represents the number of unique genetic elements of a particular type found to be significantly associated with the phage host range phenotype.
FIG S1.
Scree plot used to pick the number of dimensions for multidimensional scaling (MDS) in pyseer COG significance analysis. The number of dimensions (PCs) picked was the smallest possible (42), after which the eigenvalue stabilized with respect to dimension number.
Copyright © 2021 Moller et al.
This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.
FIG S2.
pyseer k-mer Q-Q plots for each phage (p0045, p0006, p0017, p0017S, p002y, p003p, p0040, and pyo). The observed P values were plotted relative to the expected P values based on the null distribution. Expected P values were plotted with a 95% confidence interval on the diagonal. Deviation of the observed/expected curve from the diagonal indicated P value inflation.
FIG S3.
GWAS approach and significant SNP annotations. (A) Overview of the genome-wide association study (GWAS) workflow. pyseer (42) associated intermediate-frequency COGs, core genome SNPs, and k-mers with each host range phenotype, while treeWAS (43) only associated core genome SNPs with each host range phenotype. SnpEff (112) classified mutation effects (synonymous, missense, or nonsense) from the corresponding Roary (102) gene sequence, while STRING (47) identified putative protein-protein interactions and PANTHER (48) identified enriched functions from lists of genes corresponding to each significant SNP or k-mer. (B) Classification of significantly associated pyseer or treeWAS SNPs based on mutational effect (synonymous, missense, or nonsense). SnpEff annotated SNP effects based on corresponding genes identified in the tested strains’ core genome with Roary. Phage p0045 was not included, as no significant SNPs were detected for its host range phenotype. Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
treeWAS detected 4 or fewer significant SNPs for four phages (p0006, p003p, p0040, and pyo) and none for phages p0045, p0017, p0017S, and p002y. Among significant SNPs, the majority were synonymous for each phage, with the exception of phage p0040 (Fig. S3B). A single nonsense mutation was detected for phage p002y. The number of significant k-mers in or near a gene detected ranged from 14 (pyo) to 7,078 (p0017).
Searches using the entire set of GWAS loci for potential enriched protein-protein interactions and pathways in the STRING (47) and Gene Ontology (48) databases (using the PANTHER tool) (see Fig. S3A in the supplemental material; see Table S5 at https://figshare.com/articles/dataset/Supplemental_Table_S5/13355945, Table S6 at https://figshare.com/articles/dataset/Supplemental_Table_S6/13355948, and Table S7 at https://figshare.com/articles/dataset/Supplemental_Table_S7/13355951) resulted in a biologically diverse group of functions. These included periplasmic substrate binding (p0017S, STRING), type I restriction-modification specificity (p0017S, STRING), metal ion binding (p002y, STRING; pyo, STRING and PANTHER), ATP binding (p002y, STRING and PANTHER; pyo, STRING), amino acid metabolism (pyo, STRING and PANTHER), pyrimidine metabolism (pyo, STRING), and RNA metabolism (p0045, PANTHER). We note that the search results are limited to genes present in NCTC 8325 and must be interpreted accordingly.
Confirmation of causal roles for novel determinants of host range.
We next used molecular genetic experiments to confirm a causal role for genes discovered in the GWAS for which there were no previous references in the literature for a role in S. aureus phage host range. The genes (trpA, p002y/pyseer; phoR, p002y, p003p, p0040/pyseer; isdB, p002y, p0040/pyseer; sodM, p002y, p003p/pyseer; mprF/fmtC, p002y/pyseer; and relA, p003p/pyseer) were selected for validation because there were available transposon mutants in the Nebraska Transposon Mutant Library (NTML) (49) and these mutants could be backcrossed into the wild-type USA300 to eliminate second site mutations. Because backcrossing requires preparing a phage lysate on the mutant, we could not use transposon mutants that would confer full resistance (e.g., insertions in wall teichoic acid biosynthesis genes tarJ or tagH), as this resistance to phage infection would prevent lysate preparation. Nonetheless, we backcrossed selected mutants into their isogenic background USA300 JE2 and complemented these strains with the multicopy vector pOS1-Plgt (50).
We assessed the USA300 JE2 background, transposon mutants, transposon mutants with empty vectors, and complemented transposon mutants for growth defects and phage resistance with the previously described high-throughput (Fig. 5; see Fig. S6 in the supplemental material) and efficiency of plating (EOP) assays (Fig. 5 and Fig. S7), respectively. No strains had growth defects with respect to each other or the wild-type background (Fig. S4). We found significant decreases in phage resistance for all mutants in the presence of phages p0006, p0017S, p003p, and p0040 (P < 0.05, Wilcoxon signed-rank test). However, when we attempted to rescue the phenotype by complementation, we found corresponding rescue of phage resistance back toward the wild-type phenotype only for trpA, phoR, sodM, and fmtC (P < 0.05, Wilcoxon signed-rank test). Interestingly, the fmtC allele from NRS209 did not complement the fmtC::Tn insertion, while the fmtC allele from the same strain (USA300 JE2) did, suggesting allele specificity for fmtC in phage resistance effects. As found in growth curves (Fig. S4), in the high-throughput assay, for the most part, mutations and plasmids did not affect bacterial growth in the absence of phage (Fig. 5, no-phage panel; Fig. S6). We further evaluated bacterial survival after the high-throughput assay by measuring the number of CFU in assay soft agar after overnight culture for the trpA set of strains and phage p003p. As expected, the number of surviving CFU correlated with final OD, with significantly (P < 0.05, Wilcoxon signed-rank test) higher CFU counts and OD for USA300 JE2 than for USA300 trpA::Tn and for USA300 trpA::Tn pOS1 trpA than for USA300 trpA::Tn pOS1 (Fig. S5).
FIG 5.
Molecular genetics validates putative phage resistance determinants. High-throughput host range assay (top) and efficiency of plating (EOP) (bottom) phenotypes demonstrating genetic validation of novel GWAS phage host range determinants are shown. Results are grouped by gene (trpA, phoR, isdB, sodM, fmtC, and relA). All assays were performed with siphovirus p003p or no phage. Each gene group includes four strains demonstrating complementation with proper controls (USA300, USA300 transposon mutant, USA300 transposon mutant with empty pOS1 vector, and USA300 transposon mutant complemented with gene in pOS1 vector). All significant (P < 0.05) pairwise differences (Wilcoxon signed-rank test) are indicated at the top of the corresponding box plots (ns, not significant; *, P = 0.01 to 0.05; **, P = 0.001 to 0.01; ***, P = 0.0001 to 0.001; ****, P = 0 to 0.0001).
FIG S4.
Growth curves of USA300, USA300 transposon mutants (A), transposon mutants electroporated with the empty pOS1 vector (B), and transposon mutants complemented with vectors containing the respective genes (C) (trpA, phoR, isdB, sodM, fmtC, and relA). Strains were inoculated with a 96-pin replicator from arrayed frozen glycerol stocks into 96-well plates containing 200 μl LB-TSB 2:1 with 5 mM CaCl2 or the same medium supplemented with 10 μg/ml chloramphenicol in each well. We then diluted each culture 1:100 in fresh LB-TSB 2:1 with 5 mM CaCl2 or the same medium supplemented with 10 μg/ml chloramphenicol and collected growth curves on a BioTek Eon plate reader (37°C, 225 rpm agitation, OD600 measured every 10 min).
FIG S5.
Bacterial survival after completion of the high-throughput host range assay (p003p against trpA strains). The high-throughput assay was performed for six biological replicates of USA300, USA300 trpA::Tn, USA300 trpA::Tn pOS1, and USA300 trpA::Tn pOS1 trpA strains. (A) ODs were measured for the high-throughput phage host range assay replicates as described previously. (B) Agar plugs were removed with toothpicks and transferred to 0.8-ml volumes of sterile TMG, and bacteria were resuspended by vortexing. The resuspensions were serially diluted in TMG, and 4 μl of 1e−1 through 1e−6 dilutions were spotted four times on TSA plates. Dilution plates were grown overnight at 37°C, and colonies were counted the following day to determine the number of surviving CFU under each condition.
FIG S6.
High-throughput host range assay phenotypes demonstrating genetic validation of novel GWAS phage host range determinants. Results are grouped by gene (trpA, phoR, isdB, sodM, fmtC, and relA) and phage (p0045, p0017S, p003p, p0040, p0006, p002y, pyo, and no phage). Each group includes four strains demonstrating complementation with proper controls (USA300, USA300 transposon mutant, USA300 transposon mutant with empty pOS1 vector, and USA300 transposon mutant complemented with gene in pOS1 vector). All significant (P < 0.05) pairwise differences (Wilcoxon signed-rank test) are shown at the top of the corresponding box plots. Siphoviridae are listed in red, Myoviridae in blue, and the no-phage control in gray.
FIG S7.
Efficiency of plating (EOP) phenotypes demonstrating genetic validation of phage host range determinants. Undiluted through 1e−8 dilutions of phage were spotted (4 μl) three times on each top agar lawn, allowed to dry, and incubated face up overnight at 37°C, and plaques were counted at the lowest countable dilution. EOP was calculated relative to the average PFU/ml for the control strain, USA300 JE2. Results are grouped by gene (trpA, phoR, isdB, sodM, fmtC, and relA) and phage (p0045, p0017S, p003p, p0040, p0006, p002y, and pyo). Siphoviridae are listed in red, and Myoviridae in blue. Each group includes four strains demonstrating complementation with controls (USA300, USA300 transposon mutant, USA300 transposon mutant with empty pOS1 vector, and USA300 transposon mutant complemented with gene in pOS1 vector). All significant (P < 0.05) pairwise differences (Wilcoxon signed-rank test) are shown at the top of the corresponding box plots.
We did not observe any significant changes in phage propagation efficiency when performing the efficiency of plating (EOP) assay on these strains, except for USA300/USA300 trpA::Tn pOS1 trpA, USA300/USA300 phoR::Tn pOS1, and USA300/USA300 relA::Tn pOS1 relA (P < 0.05, Wilcoxon signed-rank test). EOP measures differences in plaquing, that is, actual infection and phage propagation, whereas the growth-based assay measures survival despite infection. We interpreted the different results between the EOP and growth assays to indicate that these genes (trpA, phoR, isdB, sodM, fmtC, and relA) mostly influence survival postinfection and do not necessarily prevent infection. Taken together, these results confirmed that at least six GWAS-significant genes are implicated in phage resistance for some of the eight phages but not necessarily at the level of direct interference with phage propagation.
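The arithmetic behind an EOP value can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function names and the plaque counts are hypothetical, and only the 4-μl spot volume and the use of the average USA300 JE2 titer as the denominator are taken from the assay description.

```python
from statistics import mean

SPOT_VOLUME_ML = 0.004  # 4 ul spots, as in the EOP assay description


def titer_pfu_per_ml(plaques, dilution, volume_ml=SPOT_VOLUME_ML):
    """Convert a plaque count at a given dilution (e.g. 1e-6) to PFU/ml."""
    if dilution <= 0 or volume_ml <= 0:
        raise ValueError("dilution and volume must be positive")
    return plaques / (dilution * volume_ml)


def eop(test_titers, control_titers):
    """EOP of each test replicate relative to the mean control titer."""
    control_mean = mean(control_titers)
    return [t / control_mean for t in test_titers]


# Hypothetical (plaques, dilution) pairs for three spots per strain.
control = [titer_pfu_per_ml(p, d) for p, d in [(20, 1e-6), (18, 1e-6), (22, 1e-6)]]
mutant = [titer_pfu_per_ml(p, d) for p, d in [(10, 1e-6), (9, 1e-6), (11, 1e-6)]]
print(eop(mutant, control))
```

An EOP near 1 indicates that the phage propagates on the test strain about as well as on the control, whereas values well below 1 indicate reduced plaquing.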
Host range predictive models based on significant genetic determinants explain most phenotypic variation.
In order to determine the extent to which host range is predictable by the loci identified by GWAS, we constructed predictive models for qualitative host range phenotypes using random forests, gradient-boosted decision trees, and neural networks. We determined predictive accuracy for each phage host range phenotype and four different sets of predictors (presence/absence of significant genetic determinants or k-mers from GWAS result, with or without sequence type and clonal complex for corresponding strains) with 10-fold cross-validation (Fig. 6A; Fig. S8A). In no cases were there significant differences in 10-fold cross-validation predictive accuracies between model construction methods or predictor sets used, suggesting that no combination of method and predictors improved model predictive accuracy relative to another and that there is a limit to the amount of host range variation explained by the predictive models. The phages p0017S (predictive accuracy, 0.83 to 0.87), p002y (0.81 to 0.88), p003p (0.83 to 0.92), and pyo (0.83 to 0.91) had the highest average predictive accuracies, followed by p0045 (0.67 to 0.73), p0006 (0.47 to 0.61), p0040 (0.42 to 0.61), and p0017 (0.45 to 0.54). We hypothesized that predictive accuracy correlated with host range distribution, expecting simpler distributions to be easier to predict and thus to have higher predictive accuracies. We thus examined the relationship between information entropy (average level of uncertainty or information in a variable’s possible outcomes) and predictive accuracy (Fig. 6B and C; Fig. S8B and C). We found that predictive accuracy increased at the extremes of phenotype proportion (S, SS, R) and that information entropy was negatively correlated with predictive accuracy for all models.
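To make the link between phenotype proportions and validation accuracy concrete, the following sketch runs 10-fold cross-validation with a trivial majority-class predictor. This predictor is only a stand-in for the random forest, gradient-boosted tree, and neural network models used here, and the two phenotype vectors are hypothetical. It illustrates how fold accuracies are averaged and why a skewed phenotype distribution yields high accuracy even for an uninformative model.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_accuracy(labels, k=10, seed=0):
    """Mean validation accuracy of a majority-class predictor over k folds.

    The majority-class rule is a placeholder for a real model; it uses no
    genetic predictors at all.
    """
    if len(labels) < k:
        raise ValueError("need at least k labels")
    accuracies = []
    for fold in k_fold_indices(len(labels), k, seed):
        held_out = set(fold)
        train = [labels[i] for i in range(len(labels)) if i not in held_out]
        majority = Counter(train).most_common(1)[0][0]
        correct = sum(labels[i] == majority for i in fold)
        accuracies.append(correct / len(fold))
    return sum(accuracies) / k


# Hypothetical ternary phenotypes for 259 strains (not data from this study).
skewed = ["S"] * 233 + ["SS"] * 13 + ["R"] * 13
balanced = ["S"] * 87 + ["SS"] * 86 + ["R"] * 86
print(cv_accuracy(skewed), cv_accuracy(balanced))
```

Comparing a real model against such a baseline is one way to judge how much of the accuracy comes from the genetic predictors rather than from class imbalance.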
FIG 6.
Construction of predictive models for each ternary phage resistance phenotype. Quantitative host range phenotypes were classified as sensitive (S), semisensitive (SS), or resistant (R) based on the bins (OD600, 0.1 to 0.4, 0.4 to 0.7, and 0.7 or more, respectively). Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple. (A) Tenfold cross-validation predictive accuracies for each phage based on two model building methods (randomForest and XGBoost) and four sets of predictors, all significant GWAS genetic determinants (COGs, SNPs, and k-mers) for a particular phage, all determinants plus corresponding strain sequence type and clonal complex (ST and CC), significant k-mers for a particular phage, and significant k-mers plus strain ST and CC. Average accuracies of four 10-fold cross-validation (CV) replicates are presented with 1 standard error above and below the mean. Validation accuracy represents the proportion of correctly identified ternary phenotypes in the validation set (one-tenth of the strain set). (B) Average accuracies from four 10-fold CV replicates for each model building method and all significant GWAS determinants as predictors relative to the proportion of each ternary phenotype (S, SS, or R) among tested strains for the corresponding phage. Three points are shown for each validation accuracy result (corresponding to each of the three possible phenotypes). (C) Average accuracies from four 10-fold CV replicates for each model building method and all significant GWAS determinants as predictors relative to the information entropy for each host range phenotype, which was calculated as described in Materials and Methods. Information entropy was calculated with a natural logarithm in natural units (nats).
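The binning and entropy calculation described in this legend can be sketched as follows. The sketch assumes the standard Shannon entropy, H = −Σ p ln p, and makes two assumptions that the legend does not state: bins are half-open (an OD600 of exactly 0.4 is SS, exactly 0.7 is R), and any reading below 0.4 is treated as S. The function names are hypothetical.

```python
import math
from collections import Counter


def ternary_phenotype(od600):
    """Map a final OD600 to S, SS, or R using the bins in the legend.

    Assumption: half-open bins, and readings below 0.4 all count as S.
    """
    if od600 < 0.4:
        return "S"
    if od600 < 0.7:
        return "SS"
    return "R"


def entropy_nats(labels):
    """Shannon entropy of the label distribution, in nats (natural log)."""
    counts = Counter(labels)
    total = len(labels)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


# Hypothetical OD600 readings for six strains against one phage.
readings = [0.15, 0.22, 0.55, 0.81, 0.93, 0.30]
phenotypes = [ternary_phenotype(od) for od in readings]
print(phenotypes, entropy_nats(phenotypes))
```

Entropy is 0 when every strain has the same phenotype and reaches its maximum, ln 3 (about 1.10 nats), when the three phenotypes are equally frequent.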
FIG S8.
Construction of neural network predictive models for each ternary phage resistance phenotype. Quantitative host range phenotypes were classified as sensitive (S), semisensitive (SS), or resistant (R) based on the bins (OD600, 0.1 to 0.4, 0.4 to 0.7, and 0.7 or more, respectively). Data preprocessing included oversampling (p0045, p0017S, p002y, p003p, or pyo), lasso regression (p0017), both (p0006), or neither (p0040). (A) Predictive accuracies for each phage based on neural networks and four sets of predictors: all significant GWAS genetic determinants (COGs, SNPs, and k-mers) for a particular phage, all determinants plus corresponding strain sequence type and clonal complex (ST and CC), significant k-mers for a particular phage, and significant k-mers plus strain ST and CC. Average accuracies of four replicates are presented with 1 standard error above and below the mean. Validation accuracy represents the proportion of correctly identified ternary phenotypes in the validation set (30% of the strain set). (B) Average accuracies from four replicates and all significant GWAS determinants as predictors relative to the proportion of each ternary phenotype (S, SS, or R) among tested strains for the corresponding phage. Three points on the same horizontal are shown for each validation accuracy result (corresponding to each of the three possible phenotypes). (C) Average accuracies from four replicates and all significant GWAS determinants as predictors relative to the information entropy for each host range phenotype, which was calculated as described in Materials and Methods. Information entropy was calculated with a natural logarithm in natural units (nats). Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
We also performed the same analyses on another predictive model statistic, the receiver operating characteristic (ROC) curve area under the curve (AUC), which measures the ability of the model to distinguish between classes (true positive and true negative). We found that gradient-boosted decision tree AUCs held uniform among phages, while random forest and neural network AUCs negatively correlated with information entropy (Fig. S9 and S10), suggesting that phenotype complexity (entropy) did not affect the robustness of gradient-boosted decision tree prediction. Taken together, these results show that significant GWAS determinants from this study do not completely predict phage host range and that prediction is most effective for low-complexity host range distributions, at least for random forest and neural network models.
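For readers less familiar with the statistic, ROC AUC can be computed as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below applies this to a three-class problem with a one-vs-rest scheme. The one-vs-rest choice, the function names, and the scores are assumptions for illustration; the text does not specify how multiclass AUC was computed in this study.

```python
def roc_auc(scores, is_positive):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count one half. Equivalent to the Mann-Whitney U statistic
    divided by (n_positive * n_negative).
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(class_scores, labels):
    """Per-class AUC: each class in turn is positive, all others negative."""
    return {
        cls: roc_auc(scores, [lab == cls for lab in labels])
        for cls, scores in class_scores.items()
    }


# Hypothetical true phenotypes and model scores for class S in five strains.
labels = ["S", "S", "SS", "R", "R"]
class_scores = {"S": [0.9, 0.6, 0.7, 0.1, 0.2]}
print(one_vs_rest_auc(class_scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so unlike raw accuracy the statistic is not inflated simply because one phenotype dominates.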
FIG S9.
Evaluation of ternary phage resistance phenotype predictive models through receiver operating characteristic (ROC) area under the curve (AUC). Quantitative host range phenotypes were classified as sensitive (S), semisensitive (SS), or resistant (R) based on the bins 0.1 to 0.4, 0.4 to 0.7, and 0.7 or more (OD600), respectively. Data preprocessing included oversampling (p0045, p0017S, p002y, p003p, or pyo), lasso regression (p0017), both (p0006), or neither (p0040). (A) Tenfold cross-validation ROC AUCs for each phage based on two model building methods (randomForest and XGBoost) and four sets of predictors: all significant GWAS genetic determinants (COGs, SNPs, and k-mers) for a particular phage, all determinants plus corresponding strain sequence type and clonal complex (ST and CC), significant k-mers for a particular phage, and significant k-mers plus strain ST and CC. Average ROC AUCs of four 10-fold CV replicates are presented with 1 standard error above and below the mean. (B) Average ROC AUCs from four 10-fold CV replicates for each model building method and all significant GWAS determinants as predictors relative to the proportion of each ternary phenotype (S, SS, or R) among tested strains for the corresponding phage. Three points are shown for each ROC AUC (corresponding to each of the three possible phenotypes). (C) Average ROC AUCs from four 10-fold CV replicates for each model building method and all significant GWAS determinants as predictors relative to the information entropy for each host range phenotype, which was calculated as described in Materials and Methods. Information entropy was calculated with a natural logarithm in natural units (nats). Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
FIG S10.
Evaluation of ternary phage resistance phenotype neural network predictive models through receiver operating characteristic (ROC) area under the curve (AUC). Quantitative host range phenotypes were classified as sensitive (S), semisensitive (SS), or resistant (R) based on the bins 0.1 to 0.4, 0.4 to 0.7, and 0.7 or more (OD600), respectively. (A) ROC AUCs for each phage based on neural network models and four sets of predictors: all significant GWAS genetic determinants (COGs, SNPs, and k-mers) for a particular phage, all determinants plus corresponding strain sequence type and clonal complex (ST and CC), significant k-mers for a particular phage, and significant k-mers plus strain ST and CC. Average ROC AUCs of four replicates are presented with 1 standard error above and below the mean. (B) Average ROC AUCs from four replicates and all significant GWAS determinants as predictors relative to the proportion of each ternary phenotype (S, SS, or R) among tested strains for the corresponding phage. Three points are shown for each ROC AUC (corresponding to each of the three possible phenotypes). (C) Average ROC AUCs from four replicates and all significant GWAS determinants as predictors relative to the information entropy for each host range phenotype, which was calculated as described in Materials and Methods. Information entropy was calculated with a natural logarithm in natural units (nats). Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
DISCUSSION
Through GWAS using a diverse natural set of S. aureus strains, we discovered numerous genetic determinants of phage host range, many of which had not been reported previously in the scientific literature. This study uses a far more diverse set of strains than the previous hypothesis-free study of S. aureus phage host range (36). However, our set of genetic loci still only partially explains the variation in the overall broad host ranges of our tested phages, as the predictive modeling results indicate.
We found that knockouts of six GWAS-significant genes, i.e., trpA, phoR, isdB, sodM, fmtC, and relA, increased phage sensitivity, suggesting that these could be targets for phage therapy adjunctive drugs. trpA together with trpB (encoding tryptophan synthase alpha and beta chains, respectively) carries out the last step in L-tryptophan biosynthesis (51). The enzymes convert indole-glycerol phosphate and serine to tryptophan and glyceraldehyde 3-phosphate (51). TrpA inactivation might then sensitize S. aureus to phage infection by allowing tryptophan biosynthesis intermediates, including indole-glycerol phosphate, to accumulate at the expense of tryptophan, making trpA necessary for resistance. Alternatively, by reducing the total tryptophan pool, removing tryptophan biosynthesis may increase the proportion of tryptophan used to translate phage relative to host proteins, thus enhancing phage infection at the cost of host growth. Indeed, it is already known that throttling down protein synthesis with sublethal doses of ribosomal active antibiotics enhances plaque formation on MRSA lawns (52).
The PhoPR two-component system is responsible for regulating expression of phosphate uptake systems (ABC transporters) based on phosphate levels. In S. aureus, PhoPR is necessary for growth under phosphate-limiting conditions by regulating either phosphate transporters or other factors, depending on the environment (53). In Bacillus subtilis, the sensor kinase PhoR senses phosphate limitation through wall teichoic acid (WTA) intermediates (54) and correspondingly represses WTA biosynthesis gene expression (55). PhoPR also upregulates glycerol phosphate WTA degradation in S. aureus and B. subtilis to scavenge phosphate (56, 57). If all these mechanisms are present in S. aureus, and if there is also a pathway for degrading S. aureus ribitol phosphate (Rbo-P) WTA, PhoR activity may lead to reduced WTA under phosphate starvation, thus forming phage-resistant cells. On the other hand, as for trpA, phoR might simply be required for properly inducing the phosphate uptake necessary for survival during phage infection.
Superoxide dismutase (SodM) and phosphatidylglycerol lysyltransferase/multiple peptide resistance factor (FmtC/MprF) more likely have direct mechanistic roles in the phage infection process. SodM may be required for tolerance to cell wall stress imposed by phage infection. SodM is a Mn/Fe-dependent superoxide dismutase that converts superoxide into hydrogen peroxide and oxygen. Previous studies have shown that superoxide dismutase affects tolerance to cell wall active antibiotics in S. aureus and Enterococcus faecalis (58, 59) and phage plaquing in Campylobacter jejuni (60). Superoxide dismutase transcripts were found to be upregulated upon phage infection in E. faecalis (61). FmtC, on the other hand, may affect the lysis step by altering cell surface charge. FmtC (MprF) alters cell surface charge first by attaching the positively charged lysine to phosphatidylglycerol through esterification with glycerol (62, 63). It then flips these modified phospholipids from the inner to the outer leaflet of the cell membrane (64). The resulting positive charge on the outer leaflet of the membrane confers resistance to cationic antimicrobial peptides (CAMPs) but may also alter lysis. Phage lysis depends on holin proteins, which form pores in the membrane that dissipate proton motive force and release endolysins to degrade the cell wall peptidoglycan (65–67). Because FmtC alters cell surface charge, it also could affect holin-dependent membrane depolarization, endolysin activity, or phage attachment, especially if the phage receptor-binding protein is positively charged. Interestingly, the fmtC allele from NRS209 did not complement the transposon insertion in USA300 JE2. This could indicate either a loss of function in the allele or incompatibility with some aspect of the USA300 JE2 strain.
Two of the six validated genes did not restore wild-type phenotypes upon complementation (isdB and relA). RelA, or the relA/spoT homolog in S. aureus, synthesizes (p)ppGpp in response to sensing uncharged tRNAs on the ribosome (68). Transcriptomic studies indicated that S. aureus upregulates its relA/spoT homolog in response to lytic phage predation (69). RelA may contribute to phage-resistant, slow-growing cell (persister) formation (70), although studies indicate that ATP depletion rather than (p)ppGpp synthesis accounts for persistence in S. aureus (71). IsdB, on the other hand, is part of the iron-regulated surface determinant (Isd) system responsible for scavenging iron from hemoglobin (72). As experiments were conducted in rich medium, the hemoglobin iron-scavenging activity of IsdB does not seem relevant, but IsdB may be an abundant surface protein, implicating it in surface occlusion. Neither isdB nor relA is in an operon, at least in USA300 JE2. It might be that the native promoters are inherently stronger than the Plgt promoter or are strongly upregulated during phage infection, thus affecting the efficiency of complementation. We also note for all genes that there was no apparent complementation for phages p002y and pyo (Fig. S6). In the case of the latter two, the parental USA300 JE2 strain was already sensitive to those two phages.
These validated genes along with most other GWAS-detected host range factors have not been previously reported as important in S. aureus phage infection, but the GWAS did identify some known factors. Such factors included WTA biosynthesis and modification genes tarP, tarJ, and tagH. While TarJ and TagH are involved in WTA biosynthesis, the WTA glycosyltransferase TarP was recently shown to directly confer Podoviridae resistance. Capsule biosynthesis (cap8A and cap8I) (73) and peptidoglycan modification (oatA) genes (45) encode surface-associated functions previously implicated in S. aureus phage resistance. Capsule or capsule overproduction is known to confer phage resistance in S. aureus (7, 12), while peptidoglycan O-acetyl groups are part of the phage receptor (45). Type I restriction-modification (hsdS) was implicated as well, and this is a well-known mechanism for suppression of infection across clonal complexes (37). Staphylococcal pathogenicity islands (SaPIs) were not implicated, most likely because these are highly specific to siphovirus helper phages, and even for possibly affected helper phages (80α), SaPI interference reduces but does not eliminate helper phage production (74). This means that our high-throughput assay may not capture SaPI-level effects, as it does not directly measure phage propagation through plaquing efficiency. CRISPRs were not significant in our study either, because these are rare in S. aureus strains (1, 20, 21).
Our study agreed with prior work demonstrating that S. aureus phages have broad host ranges (28–34). A major goal of our work was to create prototype predictive models for host range based on genome sequence. Genome-based predictions for several antibiotic resistance phenotypes have proven to be of similar accuracy to classic laboratory-based assays (75). We found that S. aureus host range prediction accuracy was 40 to 95%, depending on the phage. More strains and phages will need to be added to the host range matrix to make genomic host range prediction clinically useful. The difficulty in predicting resistance may come from the large number of genes found to influence the phenotype. Resistant strains may instead have individual, unique mechanisms or other traits that simply confer phage resistance, with the exception of superinfection immunity, in which host-encoded prophages prevent infection of a cognate temperate phage by repressing its lytic genes with their cI repressors (46). The two phages with the highest overall resistance (p0045 and p0040 [Fig. 2]) are temperate Siphoviridae. Most isolated S. aureus strains encode prophages (76), making superinfection immunity and the corresponding overall p0045 and p0040 resistance common in the tested strains.
There are limitations to phage host range measurement. The high-throughput assay did not measure lysis directly but also did not have the disadvantages of observer bias, low throughput, and qualitative output of the spot assay. In our host range assay, we measure the ability of the population overall to survive phage challenge, but this could also indicate the phage suppression of bacterial growth through some level of infection. Likewise, multiple possible sets of population dynamics confound the spot assay. Efficiency of plating (EOP), on the other hand, measures phage propagation efficiency directly by comparing phage titer on a permissive control strain to that on a test strain (77). Nonetheless, factors altering EOP still might affect any stage of the infection cycle, so EOP measurement does not suggest a possible phage resistance mechanism. The ambiguity of both assays suggests that examining the population dynamics of phages and identified mutants (e.g., trpA::Tn) during infection (i.e., adsorption rate, latent period, and burst size from a one-step growth curve) would be worthwhile in future studies to pinpoint the specific mechanism by which each gene affects phage resistance. We also recognize that a multitude of environmental variables (temperature, multiplicity of infection, growth medium) might influence the assay.
There are also some limitations inherent in GWAS approaches. Bacterial GWAS associates homoplasic variants that arise from parallel evolution or recombination with a phenotype of interest (78, 79). While bacterial GWAS can find more types of genetic events (either loss of function or gain of function, mutation, insertion, deletion, recombination, and so on, but not genes with no changes) and more broadly relevant genes and polymorphisms related to a phenotype than screening transposon mutants in a single genetic background, clonal population structure, abundant small effect variants, and genetic interactions hamper it (78). When recombination is relatively rare in a species, like S. aureus, large numbers of variants remain in linkage disequilibrium, making it difficult to distinguish lineage from strain-level effects. Loci linked to a causative variant may then be detected as false positives. While we have validated at least a few genes as true positives, and we expect phylogenetically hierarchical effects on host range based on reviewing past work (1), our GWAS methods also include various corrections for clonal population structure when variants are associated.
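The linkage problem described above can be illustrated with the squared correlation (r²) between two variants, a standard measure of linkage disequilibrium. The sketch below assumes haploid genomes with each variant coded as presence (1) or absence (0) per strain. The function name and the variant vectors are hypothetical, and the calculation is not one of the population structure corrections applied in our GWAS methods.

```python
def r_squared(variant_a, variant_b):
    """Squared correlation between two 0/1 presence/absence vectors."""
    n = len(variant_a)
    if n == 0 or n != len(variant_b):
        raise ValueError("vectors must be non-empty and of equal length")
    p_a = sum(variant_a) / n
    p_b = sum(variant_b) / n
    p_ab = sum(a * b for a, b in zip(variant_a, variant_b)) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both variants must be polymorphic")
    return d * d / denom


# Hypothetical variants across eight strains from two lineages.
causal = [1, 1, 1, 1, 0, 0, 0, 0]
hitchhiker = [1, 1, 1, 1, 0, 0, 0, 0]  # fixed in the same lineage
unlinked = [1, 0, 1, 0, 1, 0, 1, 0]    # shuffled by recombination
print(r_squared(causal, hitchhiker), r_squared(causal, unlinked))
```

A variant with r² close to 1 relative to a causative locus carries the same association signal, which is why it cannot be separated from the true determinant without functional validation.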
Two recent studies used single gene knockout, overexpression, and transcriptional suppression methods as well as global transcriptional profiling to identify phage resistance determinants in Escherichia coli (80) and Enterococcus faecalis (61). Unlike these previous studies, our findings are not limited to one or a few genetic backgrounds, making them more widely applicable to the species and its underlying evolution. Nonetheless, extensive functional molecular genetics studies will be needed to distinguish genes that truly contribute to host range from false positives. These studies, like those in E. coli and E. faecalis, would complement the GWAS with global searches for phage resistance genes in a single genetic background, such as transposon insertion sequencing (Tn-Seq), dual-barcoded shotgun expression library sequencing (DUB-Seq), and CRISPR interference to identify genes required for surviving phage infection and transcriptome sequencing (RNA-Seq) to identify genes differentially regulated in response to phage infection. Such work would both corroborate GWAS results and fill in the gaps, possible determinants not present or conserved in enough of the resistant or sensitive population.
Our results have important consequences for phage therapy, phage-small molecule combination therapy, and horizontal gene transfer in the species. The genes identified expand the set of potential combination therapies by providing additional targets against which to design small molecules that interfere with phage resistance. Already, combination phage-antibiotic therapies have shown promise for clearing biofilms and reducing emergence of antibiotic resistance in S. aureus (81), and ribosomal active antibiotics are known to enhance MRSA phage sensitivity at sublethal doses (52). Additionally, because the phage receptor WTA is necessary for methicillin resistance (39) and WTA inhibition resensitizes MRSA to methicillin (40), phages have the exciting possibility of inducing collateral beta-lactam sensitivity. We also cannot discount the possibility that phage resistance polymorphisms are the result of selection by other stresses besides phage infection, such as immune escape, interbacterial interactions, or antibiotic selection. Wall teichoic acid, the S. aureus phage receptor, for example, is also important for colonization, antibiotic resistance, and immune evasion (39, 82–87). Because we identified phage host range determinants, we also gained insights into the evolution of S. aureus through horizontal gene transfer (HGT). Transduction, the transfer of host genetic material between strains by abortive phage infection, is a major mechanism of HGT (88) and recombination (89) in the species. There is a trade-off between the need to resist phage killing and the need to adapt by gaining new virulence genes (such as Panton-Valentine leukocidin) (90) through HGT. It is possible that the most transducible strains are both more sensitive to killing by phage infection and more able to outcompete other strains for advantageous genetic material.
The finding that even the most resistant strains (NRS148, NRS209, and NRS255) were still sensitive to two out of the eight phages may be the result of a selection for sensitivity that could be the Achilles’ heel of S. aureus when confronted by phage therapy.
MATERIALS AND METHODS
Strains, media, and phage propagation.
Phages used in this study were phage p0045 (80α-like), p0017S, p002y (DI), p003p (Mourad 87), and p0040 (Mourad 2) (Siphoviridae); p0006 (K) and pyo (Myoviridae); and p0017 (HER49/p66) (Podoviridae). All phage genomic DNA was isolated with the bioWORLD phage DNA isolation kit by following the manufacturer’s directions after phage precipitation by a previously described protocol. The corresponding genomes were prepared for sequencing with a one-dimensional (1D) ligation sequencing kit (SQK-LSK109) or 1D rapid sequencing kit (SQK-RAD004) and sequenced with an Oxford Nanopore MinION using a Flongle flow cell (FLO-FLG001). Phages p0045, p0017S, p002y (DI), p003p (Mourad 87), p0040 (Mourad 2), and p0006 (K) genomes were also sequenced with Illumina technology by the Microbial Genome Sequencing Center (MiGS) at the University of Pittsburgh.
All Siphoviridae and Myoviridae were propagated in S. aureus RN4220, while the sole podovirus was propagated in S. aureus RN4220 tarM::Tn, which was constructed by transducing strain RN4220 with Nebraska Transposon Mutant Library (NTML) (49) strain USA300 JE2 tarM::Tn (NE611) phage 0045 lysate. Strains, phages, and plasmids used for phage propagation and molecular genetic validation of GWAS results are listed in Table 4. Transduction was performed according to a previously published protocol (91). All overnight cultures were grown in LB-Trypticase soy broth at 2:1 (LB-TSB 2:1) supplemented with 5 mM CaCl2 to promote phage adsorption.
TABLE 4.
Strains, phages, and plasmids used for phage propagation and molecular genetic validation of GWAS results
Strain, phage, or plasmid | Characteristics/description | Reference(s) |
---|---|---|
E. coli strains | ||
DH5α | E. coli cloning strain; F− endA1 glnV44 thi-1 recA1 relA1 gyrA96 deoR nupG purB20 φ80dlacZΔM15 Δ(lacZYA-argF)U169 hsdR17 (rK– mK+) λ− | 119 |
IM08B | E. coli cloning strain with S. aureus CC8 DNA modification; DNA cytosine methyltransferase (dcm)-negative mutant of E. coli K-12 DH10B; mcrA Δ(mrr-hsdRMS-mcrBC) φ80lacZΔM15 ΔlacX74 recA1 araD139 Δ(ara-leu)7697 galU galK rpsL endA1 nupG Δdcm ΩPhelp-hsdMS (CC8-2) ΩPN25-hsdS (CC8-1) | 114 |
S. aureus strains | ||
RN4220 | Phage propagation strain; background for transducing tarM::Tn; cloning intermediate for pOS1-Plgt-fmtC | 120 |
RN4220 tarM::Tn | Podovirus (p0017) propagation strain; generated by transducing RN4220 with USA300 JE2 tarM::Tn (NE611) phage 0045 lysate | This study |
USA300 JE2 | Wild-type for genetic validation experiments and background for transposon mutant backcrossing | 49 |
USA300 JE2 tarM::Tn (NE611) | Transposon mutant transduced into RN4220 to make RN4220 tarM::Tn by a NE611 phage 0045 lysate | 49 |
USA300 JE2 trpA::Tn | Mutant NE304 backcrossed into USA300 JE2 | This study |
USA300 JE2 trpA::Tn pOS1 | Complemented backcrossed mutant with empty vector | This study |
USA300 JE2 trpA::Tn pOS1 trpA | Complemented backcrossed mutant with trpA from USA300 JE2 | This study |
USA300 JE2 phoR::Tn | Mutant NE618 backcrossed into USA300 JE2 | This study |
USA300 JE2 phoR::Tn pOS1 | Complemented backcrossed mutant with empty vector | This study |
USA300 JE2 phoR::Tn pOS1 phoR | Complemented backcrossed mutant with phoR from USA300 JE2 | This study |
USA300 JE2 isdB::Tn | Mutant NE1102 backcrossed into USA300 JE2 | This study |
USA300 JE2 isdB::Tn pOS1 | Complemented backcrossed mutant with empty vector | This study |
USA300 JE2 isdB::Tn pOS1 isdB | Complemented backcrossed mutant with isdB from USA300 JE2 | This study |
USA300 JE2 sodM::Tn | Mutant NE1224 backcrossed into USA300 JE2 | This study |
USA300 JE2 sodM::Tn pOS1 | Complemented backcrossed mutant with empty vector | This study |
USA300 JE2 sodM::Tn pOS1 sodM | Complemented backcrossed mutant with sodM from USA300 JE2 | This study |
USA300 JE2 fmtC::Tn | Mutant NE1360 backcrossed into USA300 JE2 | This study |
USA300 JE2 fmtC::Tn pOS1 | Complemented backcrossed mutant with empty vector | This study |
USA300 JE2 fmtC::Tn pOS1 fmtC | Complemented backcrossed mutant with fmtC from USA300 JE2 | This study |
USA300 JE2 fmtC::Tn pOS1 fmtC209 | Complemented backcrossed mutant with fmtC from NRS209 | This study |
USA300 JE2 relA::Tn | Mutant NE1714 backcrossed into USA300 JE2 | This study |
USA300 JE2 relA::Tn pOS1 | Complemented backcrossed mutant with empty vector | This study |
USA300 JE2 relA::Tn pOS1 relA | Complemented backcrossed mutant with relA from USA300 JE2 | This study |
Phages | ||
p0045 (80α-like) | Siphoviridae phage; also used for backcrossing and pOS1-Plgt-fmtC transduction from RN4220 into USA300 fmtC::Tn | 5, 6, 121 |
p0006 (K) | Myoviridae phage; GenBank accession no. NC_005880.2 | 30, 31, 122 |
p0017 (HER49/p66) | Podoviridae phage; GenBank accession no. NC_007046.1 | This study |
p0017S | Siphoviridae phage | This study |
p002y (DI) | Siphoviridae phage | This study |
p003p (Mourad 87) | Siphoviridae phage | This study |
p0040 (Mourad 2) | Siphoviridae phage | This study |
pyo | Myoviridae phage; BioProject accession no. PRJNA477834 | 81, 123 |
Plasmids | ||
pOS1-Plgt | Empty complementation vector | 50 |
pOS1-Plgt-trpA | Complementation vector with trpA cloned downstream of Plgt | This study |
pOS1-Plgt-phoR | Complementation vector with phoR cloned downstream of Plgt | This study |
pOS1-Plgt-isdB | Complementation vector with isdB cloned downstream of Plgt | This study |
pOS1-Plgt-sodM | Complementation vector with sodM cloned downstream of Plgt | This study |
pOS1-Plgt-fmtC | Complementation vector with fmtC cloned downstream of Plgt | This study |
pOS1-Plgt-fmtC209 | Complementation vector with fmtC from NRS209 cloned downstream of Plgt | This study |
pOS1-Plgt-relA | Complementation vector with relA cloned downstream of Plgt | This study |
Phages were propagated by inoculating a chunk of soft agar containing a plaque and surrounding bacteria into liquid medium. Phage lysates in TMG (Tris-magnesium-gelatin) buffer were spotted (4 μl) on a top agar (0.8% agar, 0.8% NaCl) lawn (5 ml) containing 0.2 ml of a 1:10 dilution of an RN4220 or RN4220 tarM::Tn overnight culture (18 h of growth, 37°C, 250 rpm). After overnight growth at 37°C, a chunk of soft agar containing a plaque and surrounding bacteria was inoculated into 35 ml of LB-TSB 2:1 with 5 mM CaCl2. This phage-bacterium coculture was grown overnight at 37°C and 250 rpm, centrifuged for 20 min at 4,000 rpm, and filtered with a 0.45-μm syringe filter before being stored at 4°C. The resulting lysate was titered on RN4220 (Siphoviridae or Myoviridae) or RN4220 tarM::Tn (Podoviridae).
Phage resistance/host range assays.
Two hundred fifty-nine previously genome-sequenced S. aureus strains consisting of 126 strains from the Network on Antimicrobial Resistance in Staphylococcus aureus (NARSA) repository (NCBI BioProject accession no. PRJNA289526) (92), 69 strains previously sequenced in a vancomycin-intermediate S. aureus (VISA) study (93) (PRJNA239001), and 64 strains previously sequenced in a cystic fibrosis (CF) lung colonization study (94) (PRJNA480016) were rapidly profiled for resistance to the eight phages using a high-throughput assay. Arrayed glycerol (50%) stocks of the strains were used to inoculate 96-well plates containing 200 μl of LB-TSB 2:1 with 5 mM CaCl2 in each well using a 96-pin replicator. Cultures were grown overnight at 37°C and 225 rpm. The following day, overnight cultures were diluted 1:10 in double-distilled water (ddH2O). In order to permit phage adsorption, 10 μl of each phage lysate (∼1e9 PFU/ml) was coincubated with 10 μl of each overnight culture dilution for 30 min at room temperature in 96-well plates. A 200-μl volume of molten LB-TSB-CaCl2 agar (LB-TSB 2:1 with 5 mM CaCl2 and 0.4% agar) was then added to each well containing the culture-phage mixtures and allowed to solidify. After incubation overnight (37°C), the plates were photographed and final optical densities at 600 nm (OD600) per well were measured using a plate reader (BioTek Eon). Strains were categorized as sensitive (OD600, 0.1 to 0.4), semisensitive (0.4 to 0.7), or resistant (0.7 or greater) on the basis of classifying the average final OD600 from at least six replicates into three equal bins (with the third bin counting outlier resistant strains with OD600s above 1). Strains and host range phenotypes (quantitative and quantitative converted to ternary) are listed in Tables S1 and S2.
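The binning of average final OD600 values into the three phenotype classes can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name and the handling of the bin edges (a value of exactly 0.4 is treated as semisensitive, and averages below 0.1 are treated as sensitive) are assumptions.

```python
from statistics import mean

def classify_od600(replicates, sensitive_max=0.4, resistant_min=0.7):
    """Return 'S', 'SS', or 'R' for one strain-phage pair.

    replicates: final OD600 readings (the text specifies at least six).
    Edge handling is an assumption: [.., 0.4) -> S, [0.4, 0.7) -> SS, [0.7, ..) -> R.
    """
    average = mean(replicates)
    if average < sensitive_max:
        return "S"   # sensitive: little bacterial growth in the presence of phage
    if average < resistant_min:
        return "SS"  # semisensitive: intermediate growth
    return "R"       # resistant: growth largely unaffected (includes OD600 > 1)

# Hypothetical readings for one strain challenged with one phage.
print(classify_od600([0.21, 0.25, 0.19, 0.23, 0.22, 0.20]))
```

Applying the function to every strain-phage pair would give a ternary matrix of the kind listed alongside the quantitative values in Tables S1 and S2.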
High-throughput assays were also calibrated against a standard spot assay. One hundred eight NARSA strains were tested for resistance to five of the eight phages listed previously (phages p0045, p0006, p0017S, p002y, and p003p). Briefly, an overnight culture of each strain was diluted 1:10 in ddH2O, and a top agar lawn (0.2 ml of dilution per 5 ml molten top agar) was poured on a Trypticase soy agar (TSA) plate. After solidification, each of the five lysates was spotted (4 μl) twice on the top agar lawn and allowed to dry. The plates were then incubated face up overnight at 37°C, and the spots were evaluated for clearing (sensitive), turbid clearing (semisensitive), or no clearing (resistant) the following day. High-throughput assay and spot assay phenotypes were compared in box plots made with ggplot2 (95). The statistical significance of high-throughput assay phage resistance differences between all possible pairs of sensitive (S), semisensitive (SS), and resistant (R) strains was assessed with Wilcoxon signed-rank tests.
Bioinformatic processing.
Phage p0017 and pyo genomes were assembled from Oxford Nanopore reads with canu 2.0 (96). Hybrid Illumina/Nanopore phage genome assemblies were constructed using Unicycler 0.4.8, filtering for contigs with coverage higher than 5× (97). The average nucleotide identity (ANI) was then determined among all phage contigs using fastANI 1.31 (98), which is shown as a lower-triangle identity matrix in Table S1. All S. aureus genomes were processed using the Staphopia analysis pipeline (99), which included de novo assembly using SPAdes (100) and annotation using Prokka (101). The core genome phylogenetic tree was constructed by first determining the core genome alignment for all tested strains with Roary (102), correcting for recombination with Gubbins (103), and then generating a maximum-likelihood phylogenetic tree with IQ-TREE (104). Strains (253 in total) for which there are corresponding phage resistance phenotypes (quantitative and qualitative), BioProject, BioSample, and SRA accessions, sequence types, clonal complexes, isolation years, and isolation locations are listed in Table S2. MLST (multilocus sequence typing) sequence types were identified for each genome with the mlst command line tool (105), which uses the PubMLST website (https://pubmlst.org/) (106). Quantitative phage resistance phenotypes were annotated on the tree using the Interactive Tree of Life (iTOL) (107).
Preliminary phenotype analysis.
Phage resistance phenotypes were initially placed on a core genome phylogenetic tree and were associated with two factors, clonal complex (CC) and MRSA/MSSA genetic background. Phage resistance associations with CC and MRSA/MSSA were visualized in box plots made with ggplot2 (95). Statistical significance of phage resistance differences between MRSA/MSSA was determined with Wilcoxon signed-rank tests. Statistical significance of overall phage resistance differences between represented CCs was determined using one-way analysis of variance (ANOVA) tests with or without phylogenetic correction.
Measuring phylogenetic signal.
Four different measures of phylogenetic signal were calculated for each phenotype: Abouheif’s Cmean, Moran’s I, Pagel’s λ, and Blomberg’s K (41). Abouheif’s Cmean and Moran’s I were calculated with the abouheif.moran function from the adephylo R package (108), while Pagel’s λ and Blomberg’s K were calculated using the phylosig function from the phytools R package (109). Phylogenetic signal was determined using the core genome phylogenetic tree annotated with quantitative phage resistance data previously described. Randomization tests for phylogenetic signal calculation were performed with 999 permutations of the data.
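The logic of a 999-permutation randomization test can be illustrated with Moran's I. The sketch below is an assumption-laden toy rather than a reimplementation of abouheif.moran: the weight matrix is a hypothetical stand-in for a phylogenetic proximity matrix (1 for tips in the same clade, 0 otherwise), and the phenotype values are invented.

```python
import random

def morans_i(values, weights):
    """Moran's I for trait values given a symmetric tip-by-tip weight matrix."""
    n = len(values)
    centre = sum(values) / n
    dev = [v - centre for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    weight_sum = sum(weights[i][j] for i, j in pairs)
    numerator = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    denominator = sum(d * d for d in dev)
    return (n / weight_sum) * (numerator / denominator)

def permutation_test(values, weights, n_perm=999, seed=0):
    """Shuffle trait values across tips and count statistics >= the observed one."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)

# Toy tree: two clades of five tips; clade 1 is phage sensitive, clade 2 resistant.
clade = [0] * 5 + [1] * 5
weights = [[1.0 if clade[i] == clade[j] else 0.0 for j in range(10)] for i in range(10)]
od600 = [0.20, 0.25, 0.22, 0.18, 0.24, 0.90, 0.95, 0.88, 0.92, 0.85]

observed, p_value = permutation_test(od600, weights)
print(f"Moran's I = {observed:.3f}, permutation P = {p_value:.3f}")
```

A phenotype that tracks the tree closely yields a statistic near 1 and a small permutation P value, whereas a phenotype scattered at random across the tips yields a statistic near the permutation average.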
GWAS.
Genotypes were associated with phage host range phenotype data using two different genome-wide association study (GWAS) pipelines, i.e., pyseer 1.2.0 (42) and treeWAS 1.0 (43). pyseer associated clusters of orthologous genes (COGs), core genome single nucleotide polymorphisms (SNPs), and k-mers with lengths between 6 and 610 bp with each phenotype, while treeWAS associated only biallelic core genome SNPs with the phenotype. treeWAS used the recombination-corrected core genome phylogeny for population structure correction, while pyseer used a conversion of the phylogeny into a kinship matrix. The core genome alignment was rearranged to set N315 as the reference (first sequence). We chose N315 as the reference because it was used as a global S. aureus reference for the Staphopia project (99). SNPs were called from the core genome alignment with SNP-sites (110). To identify significantly associated genetic determinants, Bonferroni-corrected significance thresholds were set at 0.05/6,058 (8.25e−6) for the COG GWAS, 0.05/15,557 (3.21e−6) for the SNP GWAS, and 0.05/2,304,257 (2.17e−8) for the k-mer GWAS, counting the numbers of intermediate-frequency COGs, biallelic core genome SNPs, and unique k-mers, respectively, as hypotheses to be tested.
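The Bonferroni thresholds quoted above follow directly from dividing the family-wise error rate by the number of hypotheses tested. The short sketch below reproduces that arithmetic; the dictionary keys and the helper name `is_significant` are illustrative assumptions, not pyseer or treeWAS options.

```python
ALPHA = 0.05  # family-wise error rate stated in the text

# Numbers of hypotheses stated in the text for each variant type.
HYPOTHESES = {"COG": 6_058, "SNP": 15_557, "k-mer": 2_304_257}

# Bonferroni threshold: alpha divided by the number of tests.
THRESHOLDS = {kind: ALPHA / count for kind, count in HYPOTHESES.items()}

def is_significant(p_value, kind):
    """True if an association P value passes the threshold for its variant type."""
    return p_value < THRESHOLDS[kind]

for kind, threshold in THRESHOLDS.items():
    print(f"{kind}: {threshold:.2e}")
# COG: 8.25e-06
# SNP: 3.21e-06
# k-mer: 2.17e-08
```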
pyseer SNP and COG association analyses performed multidimensional scaling (MDS) on a Mash distance matrix between tested strains to correct for population structure. pyseer SNP association was performed with a fixed-effect (for variant and covariate lineage) model, the default 10 MDS dimensions retained, and lineage effect testing on each quantitative phage resistance/host range phenotype for all biallelic core genome SNPs. pyseer COG association was performed with a fixed-effect model on each phenotype, with nine MDS dimensions retained, for intermediate-frequency COGs (see Fig. S1 in the supplemental material). pyseer k-mer association was performed with a FaST-LMM linear mixed (combined fixed variant/covariate lineage and random kinship effects) model on each quantitative phenotype for unique k-mers between 6 and 610 bp in length extracted from genomes of all tested strains. pyseer k-mer association analyses used a kinship matrix between tested strains constructed from the core genome phylogeny to correct for population structure and, like the SNP and COG analyses, set a minor allele frequency cutoff of 1%. SNP and k-mer association P values were plotted against genetic coordinates in Manhattan plots (with phandango) (111). Associations for all k-mers were assessed for P value inflation (exceeding the observed/expected P value diagonal below 1e−2) using Q-Q plots (Fig. S2). Significant SNPs and k-mers were annotated using SnpEff (112) (relative to the Roary N315 core genome sequence) and the downstream analysis scripts included with pyseer, respectively, identifying the genes containing the genetic elements (or near the genetic elements, in the case of k-mers) and, in the case of SNPs, the mutation effects.
treeWAS was performed for each phage resistance phenotype using the R package with core genome alignment, IQ-TREE core genome phylogeny, and quantitative phage resistance phenotype as inputs and with default parameters. Significant treeWAS SNPs were annotated using SnpEff (112) relative to the core genome sequence of strain N315 (113).
Functional annotation and network analysis of significantly associated genes.
Genes with significant association from the GWAS (containing SNPs and either near to or overlapping with k-mers) were then used to identify enriched protein functions or possible protein-protein interactions. Gene name lists for each phage were converted to NCTC 8325 RefSeq protein accession lists for use with STRING (47) and PANTHER (48), which depend on NCTC 8325 S. aureus accessions. To convert genes containing significant SNPs to NCTC 8325 accessions, Roary N315 core genes were aligned against NCTC 8325 RefSeq proteins with NCBI blastx (one maximum target sequence, one maximum high scoring pair, default e-value). Gene names matching NCTC 8325 RefSeq accessions were converted for each significant SNP using these alignment results. To convert genes containing significant k-mers to NCTC 8325 accessions, all significant genes were aligned against NCTC 8325 RefSeq proteins with blastx (one maximum target sequence, one maximum high scoring pair, default e-value). Gene names matching NCTC 8325 RefSeq accessions were converted for each significant k-mer using these alignment results. Any gene names not mapped to any NCTC 8325 RefSeq protein accessions after this procedure were left unchanged. Lists of significant genes for each phage, for all phage morphological classes (Siphoviridae, Myoviridae, and Podoviridae), and for each life cycle type (virulent or temperate) were used as inputs for STRING and PANTHER. STRING network properties (nodes, edges, average node degree, average local clustering coefficient, expected number of edges, and protein-protein interactions (PPI) enrichment P value) were saved for each input, while PANTHER functional classification and statistical overrepresentation test analyses were performed for each input with respect to molecular function, biological process, cellular component, protein class, and pathway.
Genetic validation of novel phage resistance mechanisms.
Six genes (trpA, phoR, isdB, sodM, fmtC, and relA) found to contain significantly associated SNPs or k-mers for any phage resistance phenotype were validated to cause phage resistance changes when knocked out in a single S. aureus genetic background (USA300 JE2). Transposon insertion mutants in each gene were selected from the Nebraska Transposon Mutant Library (NTML) (49) and backcrossed into USA300 JE2 through the transduction method previously described (91) to eliminate any possible secondary acquired mutations. Backcrossed mutants were then complemented with each gene cloned into the vector pOS1-Plgt (50). Relevant strains (selected mutants and complemented strains) are listed in Table 4. Growth curves were performed on all listed strains (Fig. S4). USA300 JE2, respective transposon mutants, empty vector controls, or complemented mutants were inoculated with a 96-pin replicator from arrayed frozen glycerol stocks into 96-well plates containing 200 μl LB-TSB 2:1 with 5 mM CaCl2 or the same medium supplemented with 10 μg/ml chloramphenicol in each well. We then diluted each culture 1:100 in fresh LB-TSB 2:1 with 5 mM CaCl2 or the same medium supplemented with 10 μg/ml chloramphenicol and collected growth curves on a BioTek Eon plate reader (37°C, 225 rpm agitation, OD600 measured every 10 min).
Genes were cloned into pOS1-Plgt either through splicing overlap extension (SOE) PCR (trpA, phoR, and sodM) or through NEB HiFi assembly (isdB, fmtC, and relA). Each gene and pOS1-Plgt were amplified with the primers listed in Table S3 at https://figshare.com/articles/dataset/Supplemental_Table_S3/13355939 to create overlap into the corresponding fragment by using NEB Q5 high-fidelity DNA polymerase according to the manufacturer’s directions. All genes were amplified from USA300 JE2 genomic DNA except for fmtC, which was amplified from both USA300 JE2 and NRS209. Genes were cloned into the same site downstream of the Plgt promoter. For SOE PCR, AMPure XP bead-purified gene and vector fragments were mixed together at a ratio of 1:59 and amplified for 20 cycles with NEB Q5 high-fidelity polymerase at an annealing temperature of 60°C. For HiFi assembly, purified gene and vector fragments were mixed together at a ratio of 1:2 (less than 0.2 pmol DNA total) and incubated with NEBuilder HiFi DNA assembly master mix for 3 h at 50°C. SOE PCR and HiFi assembly products were transformed into NEB DH5α competent cells (high efficiency), plated on LB agar with ampicillin (100 μg/ml), and grown overnight at 37°C. Transformants were verified by colony PCR with the respective LF and RR primers listed in Table S3. Plasmids were extracted from verified transformant overnight cultures with the Promega PureYield plasmid miniprep system. These plasmids were then transformed into E. coli IM08B (114) to improve electroporation efficiency into the USA300 JE2 transposon mutants.
Electrocompetent S. aureus cells (USA300 JE2 transposon mutants) were prepared as previously described (115). S. aureus electrocompetent cells were electroporated with 2 μg of ethanol-precipitated plasmid DNA (empty vector and vector with insert corresponding to transposon insertion). Electrocompetent cells were first thawed, centrifuged, and resuspended in 50 μl 10% glycerol–0.5 M sucrose. After plasmid DNA was added, cells were transferred to 0.1-cm electroporation cuvettes and pulsed at 2.1 kV, 100 Ω, and 25 μF. Immediately after electroporation, 1 ml of TSB–0.5 M sucrose was added to the cuvette, and the culture was transferred to an Eppendorf tube to recover for 90 min at 37°C and 250 rpm. Dilutions of the outgrowth were plated on TSA with chloramphenicol (10 μg/ml) and grown overnight at 37°C. Electroporants were verified by colony PCR with the respective LF and RR primers listed in Table S3.
The pOS1 fmtC and pOS1 relA plasmids, however, were introduced into the USA300 JE2 transposon mutants by transduction from RN4220 rather than by direct electroporation. S. aureus RN4220 was electroporated with the pOS1 fmtC (USA300), pOS1 fmtC (NRS209), and pOS1 relA plasmids according to the electroporation procedure described above. Plasmids were then transduced from RN4220 to USA300 JE2 transposon mutants according to a previously published procedure (91). Briefly, a recipient strain was infected with donor phage at a multiplicity of infection (MOI) of 0.1 after supplementation with CaCl2. The infected culture was then outgrown in TSB supplemented with sodium citrate to prevent phage lysogeny. The outgrowth culture was plated on TSA supplemented with both chloramphenicol (10 μg/ml) and sodium citrate (40 mM) to select for plasmids and inhibit lysogeny, respectively.
Mutants and their complemented derivatives were assessed for phage resistance and host range both through the high-throughput assay described previously and the efficiency of plating (EOP) assay (77) to assess bacterial growth in the presence of phage and phage plaquing efficiency, respectively. The high-throughput host range assay was performed as described earlier, but overnight cultures of strains were grown in LB-TSB 2:1 with 5 mM CaCl2 supplemented with chloramphenicol (10 μg/ml) to maintain plasmid selection in the case of complemented strains for this and the EOP assay. The EOP assay was performed by spotting 4 μl of neat through 1e−8 dilutions of phages p0045, p0006, p0017, p0017S, p002y, p003p, p0040, and pyo on lawns (0.2 ml of a 1:10 overnight culture dilution mixed with 5 ml of top agar) of a test and a reference (USA300) strain. Lawns were poured on TSA plates. EOP was calculated by dividing the phage titer on the test strain by that on the reference strain.
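The EOP ratio can be worked through with a small sketch. How titers were derived from the spotted dilutions is not stated in the text, so the titer function below (plaques counted in a 4-μl spot at a countable dilution) is an assumption, as are the function names and the plaque counts in the example.

```python
def titer_pfu_per_ml(plaques, dilution_exponent, spot_volume_ul=4.0):
    """Titer from plaques counted in one spot of a 10^-dilution_exponent dilution."""
    dilution = 10.0 ** (-dilution_exponent)
    spot_volume_ml = spot_volume_ul / 1000.0
    return plaques / (spot_volume_ml * dilution)

def efficiency_of_plating(test_titer, reference_titer):
    """EOP: phage titer on the test strain divided by titer on the reference strain."""
    if reference_titer <= 0:
        raise ValueError("reference titer must be positive")
    return test_titer / reference_titer

# Hypothetical counts: 12 plaques at 1e-5 on a mutant, 30 plaques at 1e-6 on USA300.
test = titer_pfu_per_ml(plaques=12, dilution_exponent=5)
reference = titer_pfu_per_ml(plaques=30, dilution_exponent=6)
eop = efficiency_of_plating(test, reference)
print(f"test = {test:.2e} PFU/ml, reference = {reference:.2e} PFU/ml, EOP = {eop:.3f}")
```

An EOP near 1 indicates that the phage forms plaques on the test strain about as efficiently as on the reference strain, while values well below 1 indicate reduced plaquing.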
Additional experiments with the trpA mutant set and phage p003p examined bacterial survival after performance of the phage-culture soft agar coincubation of the high-throughput assay. The high-throughput assay was performed as described earlier for six replicates of USA300, USA300 trpA::Tn, USA300 trpA::Tn pOS1, and USA300 trpA::Tn pOS1 trpA strains. The corresponding ODs were recorded as described for the high-throughput phage host range assay (Fig. S5A). Agar plugs were then removed with toothpicks, placed in 0.8-ml volumes of sterile TMG, and broken apart by vortexing. The resuspensions were then serially diluted in TMG, and 4-μl volumes of 1e−1 through 1e−6 dilutions were spotted four times on TSA plates. Dilution plates were grown overnight at 37°C, and colonies were counted the following day to determine the number of surviving CFU under each condition (Fig. S5B).
Construction of phage resistance phenotype predictive models.
Phage resistance predictive models were constructed using three methods, i.e., random (decision) forests, gradient-boosted decision trees, and neural networks. Random forests were generated using the randomForest R package, and gradient-boosted decision trees were generated with the XGBoost R package (116). Ternary (S, SS, or R) phenotypes converted from the original high-throughput assay quantitative phenotypes (described in “Phage resistance/host range assays”) were set as the response variable. The predictor variables were the presence or absence of each significant genetic element, of each k-mer only, or of one of these two sets (all elements or just k-mers) together with both strain sequence type (ST) and clonal complex (CC). Random forest and XGBoost predictive accuracy and receiver operating characteristic (ROC) area under the curve (AUC) were determined on the validation set through multiple replicates of 10-fold cross-validation, in which alternating tenths of the data are used for validation while the model is trained on the remaining data. The optimal number of rounds (iterations) for XGBoost was determined for each phage and set of input predictor variables with 5-fold cross-validation. XGBoost model training also used the softmax objective for multiclass (three classes: S, SS, and R) classification.
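The 10-fold cross-validation scheme described above can be sketched independently of the model being trained. The example below is a minimal illustration in Python rather than the R workflow used in the study: a majority-class predictor stands in for randomForest or XGBoost, and the function names, features, and label counts are hypothetical.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_accuracy(features, labels, fit, predict, k=10, seed=0):
    """Mean validation accuracy; each fold is held out once while the rest train."""
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(hits / len(held_out))
    return sum(accuracies) / len(accuracies)

# Hypothetical stand-in model: always predict the most common training class.
def fit_majority(train_features, train_labels):
    return Counter(train_labels).most_common(1)[0][0]

def predict_majority(model, feature_row):
    return model

# Toy data: ternary phenotypes and made-up presence/absence features for 30 strains.
labels = ["S"] * 20 + ["SS"] * 6 + ["R"] * 4
features = [(i % 2, int(i % 3 == 0)) for i in range(len(labels))]

accuracy = cross_validated_accuracy(features, labels, fit_majority, predict_majority)
print(f"mean 10-fold accuracy: {accuracy:.3f}")
```

Repeating the call with different `seed` values gives the multiple cross-validation replicates mentioned in the text. The majority-class accuracy is also a useful floor against which a trained classifier can be compared.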
Neural network model construction was more complicated, as it involved a preprocessing step to balance data sets where necessary. Oversampling or a combination of over- and undersampling methods was performed to balance specific data sets. For the oversampling method, new samples of the minority classes were randomly generated with replacement so that the number of samples for each class would be equal to that of the majority class in the original data set. For the combination method, the synthetic minority oversampling technique (SMOTE) for oversampling and Tomek links for undersampling were performed together. However, for phages with limited cases for one class type, such as p002y, we could not conduct undersampling. Therefore, for such data sets, only the oversampling method was performed. The new balanced data sets were then split into training and validation sets with 30% validation. Random splits were performed four times to generate four replicates for evaluation, each with different train and test data sets. Each replicate was evaluated as described before, with validation set prediction accuracy and ROC AUC.
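The random oversampling and 70/30 split described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the study's preprocessing code: the function names and toy data are hypothetical, and the SMOTE and Tomek-link steps are not reproduced here.

```python
import random
from collections import Counter, defaultdict

def random_oversample(samples, labels, seed=0):
    """Resample minority classes with replacement up to the majority class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(members) for members in by_class.values())
    out_samples, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_samples.extend(members + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels

def train_validation_split(samples, labels, validation_fraction=0.3, seed=0):
    """Random split into (training, validation) sets, each a (samples, labels) pair."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    n_validation = round(len(indices) * validation_fraction)
    val_idx, train_idx = indices[:n_validation], indices[n_validation:]
    pick = lambda idx: ([samples[i] for i in idx], [labels[i] for i in idx])
    return pick(train_idx), pick(val_idx)

# Toy imbalanced data set: 20 sensitive, 6 semisensitive, 4 resistant strains.
labels = ["S"] * 20 + ["SS"] * 6 + ["R"] * 4
samples = [f"strain_{i}" for i in range(len(labels))]

balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))

# Four replicates, each with a different random split (seeds are arbitrary).
for seed in range(4):
    train, validation = train_validation_split(balanced_samples, balanced_labels, seed=seed)
    print(seed, len(train[1]), len(validation[1]))
```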
Neural network models were constructed three ways: (i) with or without oversampling or with an oversampling-undersampling combination alone, (ii) as in (i) but with a regularizer and dropout layer, or (iii) as in (i) but with lasso regression for feature selection. All methods used ADAM (117) for optimizing and sparse categorical cross-entropy for loss. For imbalanced data sets, the oversampling and combination over- and undersampling methods were used as well, if possible. The fully connected neural network was constructed based on the selected, balanced data set. We then found both training and prediction accuracy to evaluate performance for each network model. We note that network models were optimized for each replicate training set, which means there may be different network models for the four replicates. In the first method, fully connected neural network models were constructed on data sets either originally balanced or balanced after oversampling/combination methods, with no further correction. Since some network models have high prediction accuracies, it is possible that these models are overfitting, so the second method adds a regularizer and a dropout layer to fully connected neural networks as new models. Finally, for some network models, the prediction accuracies were not as high as others. Thus, in the third method, lasso regression was performed to select important features and improve performance. A neural network model was constructed on the new data set based on these selected features.
Information entropy was calculated using the following equation (118) and then compared to the average randomForest and XGBoost 10-fold cross-validation and neural network predictive accuracies and ROC AUCs. In the equation, H is the total information entropy, Px(xi) is the probability of event xi, n is the number of possible events, and the three possible events are the S, SS, and R phenotypes:
$$H = -\sum_{i=1}^{n} P_x(x_i)\,\log P_x(x_i)$$
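As an illustration, the information entropy of a phenotype distribution, H = −Σ Px(xi) log Px(xi) summed over the S, SS, and R classes, can be computed from class counts as in the sketch below. The logarithm base is not stated in the text, so base 2 (entropy in bits) is an assumption here, as are the function name and the example label counts.

```python
import math
from collections import Counter

def shannon_entropy(labels, base=2.0):
    """Entropy of the empirical class distribution; base 2 is an assumed default."""
    total = len(labels)
    probabilities = [count / total for count in Counter(labels).values()]
    # Classes with zero observations never appear in the Counter, so log(0) cannot occur.
    return -sum(p * math.log(p, base) for p in probabilities)

# Hypothetical phenotype calls for one phage across 30 strains.
skewed = ["S"] * 26 + ["SS"] * 3 + ["R"] * 1
even = ["S"] * 10 + ["SS"] * 10 + ["R"] * 10

print(f"skewed: {shannon_entropy(skewed):.3f} bits")
print(f"even:   {shannon_entropy(even):.3f} bits")
```

With three classes the entropy ranges from 0 (all strains share one phenotype) to log2(3), about 1.585 bits (the three phenotypes are equally frequent). A phage with a very skewed phenotype distribution therefore has low entropy, and a high predictive accuracy for it is less informative than the same accuracy for a high-entropy phage.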
ACKNOWLEDGMENTS
We thank Veronique Perrot and Bruce Levin for providing the pyo myophage used for host range evaluation. We also thank Bruce Levin for providing constructive comments on the manuscript. Sarah Satola and Eryn Bernardy provided VISA and CF S. aureus strains used for host range testing, respectively. We thank Michelle Su and Robert Petit for assistance with GWAS methods and constructive criticism of the project.
Abraham G. Moller was supported by the National Science Foundation (NSF) Graduate Research Fellowship Program (GRFP). Timothy D. Read was supported by National Institutes of Health (NIH) grant no. AI121860. Kyle Winston was supported by an Emory REAL fellowship.
REFERENCES
- 1. Moller AG, Lindsay JA, Read TD. 2019. Determinants of phage host range in Staphylococcus species. Appl Environ Microbiol 85:e00209-19. doi: 10.1128/AEM.00209-19.
- 2. Azam AH, Tanji Y. 2019. Peculiarities of Staphylococcus aureus phages and their possible application in phage therapy. Appl Microbiol Biotechnol 103:4279–4289. doi: 10.1007/s00253-019-09810-2.
- 3. Pirisi A. 2000. Phage therapy—advantages over antibiotics? Lancet 356:1418. doi: 10.1016/S0140-6736(05)74059-9.
- 4. Nobrega FL, Costa AR, Kluskens LD, Azeredo J. 2015. Revisiting phage therapy: new applications for old resources. Trends Microbiol 23:185–191. doi: 10.1016/j.tim.2015.01.006.
- 5. Xia G, Wolz C. 2014. Phages of Staphylococcus aureus and their impact on host evolution. Infect Genet Evol 21:593–601. doi: 10.1016/j.meegid.2013.04.022.
- 6. Deghorain M, Van Melderen L. 2012. The staphylococci phages family: an overview. Viruses 4:3316–3335. doi: 10.3390/v4123316.
- 7. Azam AH, Hoshiga F, Takeuchi I, Miyanaga K, Tanji Y. 2018. Analysis of phage resistance in Staphylococcus aureus SA003 reveals different binding mechanisms for the closely related Twort-like phages ΦSA012 and ΦSA039. Appl Microbiol Biotechnol 102:8963–8977. doi: 10.1007/s00253-018-9269-x.
- 8. Takeuchi I, Osada K, Azam AH, Asakawa H, Miyanaga K, Tanji Y. 2016. The presence of two receptor-binding proteins contributes to the wide host range of staphylococcal Twort-like phages. Appl Environ Microbiol 82:5763–5774. doi: 10.1128/AEM.01385-16.
- 9. Xia G, Kohler T, Peschel A. 2010. The wall teichoic acid and lipoteichoic acid polymers of Staphylococcus aureus. Int J Med Microbiol 300:148–154. doi: 10.1016/j.ijmm.2009.10.001.
- 10. Winstel V, Liang C, Sanchez-Carballo P, Steglich M, Munar M, Bröker BM, Penadés JR, Nübel U, Holst O, Dandekar T, Peschel A, Xia G. 2013. Wall teichoic acid structure governs horizontal gene transfer between major bacterial pathogens. Nat Commun 4:2345. doi: 10.1038/ncomms3345.
- 11. Li X, Gerlach D, Du X, Larsen J, Stegger M, Kühner P, Peschel A, Xia G, Winstel V. 2015. An accessory wall teichoic acid glycosyltransferase protects Staphylococcus aureus from the lytic activity of Podoviridae. Sci Rep 5:17219. doi: 10.1038/srep17219.
- 12. Wilkinson BJ, Holmes KM. 1979. Staphylococcus aureus cell surface: capsule as a barrier to bacteriophage adsorption. Infect Immun 23:549–552. doi: 10.1128/IAI.23.2.549-552.1979.
- 13. Nordström K, Forsgren A. 1974. Effect of protein A on adsorption of bacteriophages to Staphylococcus aureus. J Virol 14:198–202. doi: 10.1128/JVI.14.2.198-202.1974.
- 14. Gerlach D, Guo Y, Castro CD, Kim S-H, Schlatterer K, Xu F-F, Pereira C, Seeberger PH, Ali S, Codée J, Sirisarn W, Schulte B, Wolz C, Larsen J, Molinaro A, Lee BL, Xia G, Stehle T, Peschel A. 2018. Methicillin-resistant Staphylococcus aureus alters cell wall glycosylation to evade immunity. Nature 563:705–709. doi: 10.1038/s41586-018-0730-x.
- 15. Xia G, Corrigan RM, Winstel V, Goerke C, Gründling A, Peschel A. 2011. Wall teichoic acid-dependent adsorption of staphylococcal siphovirus and myovirus. J Bacteriol 193:4006–4009. doi: 10.1128/JB.01412-10.
- 16. Chatterjee AN. 1969. Use of bacteriophage-resistant mutants to study the nature of the bacteriophage receptor site of Staphylococcus aureus. J Bacteriol 98:519–527. doi: 10.1128/JB.98.2.519-527.1969.
- 17. Depardieu F, Didier J-P, Bernheim A, Sherlock A, Molina H, Duclos B, Bikard D. 2016. A eukaryotic-like serine/threonine kinase protects staphylococci against phages. Cell Host Microbe 20:471–481. doi: 10.1016/j.chom.2016.08.010.
- 18. Kinnevey PM, Shore AC, Brennan GI, Sullivan DJ, Ehricht R, Monecke S, Slickers P, Coleman DC. 2013. Emergence of sequence type 779 methicillin-resistant Staphylococcus aureus harboring a novel pseudo staphylococcal cassette chromosome mec (SCCmec)-SCC-SCCCRISPR composite element in Irish hospitals. Antimicrob Agents Chemother 57:524–531. doi: 10.1128/AAC.01689-12.
- 19. Cao L, Gao C-H, Zhu J, Zhao L, Wu Q, Li M, Sun B. 2016. Identification and functional study of type III-A CRISPR-Cas systems in clinical isolates of Staphylococcus aureus. Int J Med Microbiol 306:686–696. doi: 10.1016/j.ijmm.2016.08.005.
- 20. Yang S, Liu J, Shao F, Wang P, Duan G, Yang H. 2015. Analysis of the features of 45 identified CRISPR loci in 32 Staphylococcus aureus. Biochem Biophys Res Commun 464:894–900. doi: 10.1016/j.bbrc.2015.07.062.
- 21. Zhao X, Yu Z, Xu Z. 2018. Study the features of 57 confirmed CRISPR loci in 38 strains of Staphylococcus aureus. Front Microbiol 9:1591. doi: 10.3389/fmicb.2018.01591.
- 22. Damle PK, Wall EA, Spilman MS, Dearborn AD, Ram G, Novick RP, Dokland T, Christie GE. 2012. The roles of SaPI1 proteins gp7 (CpmA) and gp6 (CpmB) in capsid size determination and helper phage interference. Virology 432:277–282. doi: 10.1016/j.virol.2012.05.026.
- 23. Ram G, Chen J, Kumar K, Ross HF, Ubeda C, Damle PK, Lane KD, Penadés JR, Christie GE, Novick RP. 2012. Staphylococcal pathogenicity island interference with helper phage reproduction is a paradigm of molecular parasitism. Proc Natl Acad Sci U S A 109:16300–16305. doi: 10.1073/pnas.1204615109.
- 24. Christie GE, Dokland T. 2012. Pirates of the Caudovirales. Virology 434:210–221. doi: 10.1016/j.virol.2012.10.028.
- 25. Ram G, Chen J, Ross HF, Novick RP. 2014. Precisely modulated pathogenicity island interference with late phage gene transcription. Proc Natl Acad Sci U S A 111:14536–14541. doi: 10.1073/pnas.1406749111.
- 26. Novick RP, Christie GE, Penadés JR. 2010. The phage-related chromosomal islands of Gram-positive bacteria. Nat Rev Microbiol 8:541–551. doi: 10.1038/nrmicro2393.
- 27. Poliakov A, Chang JR, Spilman MS, Damle PK, Christie GE, Mobley JA, Dokland T. 2008. Capsid size determination by Staphylococcus aureus pathogenicity island SaPI1 involves specific incorporation of SaPI1 proteins into procapsids. J Mol Biol 380:465–475. doi: 10.1016/j.jmb.2008.04.065.
- 28. Hsieh S-E, Lo H-H, Chen S-T, Lee M-C, Tseng Y-H. 2011. Wide host range and strong lytic activity of Staphylococcus aureus lytic phage Stau2. Appl Environ Microbiol 77:756–761. doi: 10.1128/AEM.01848-10.
- 29. Synnott AJ, Kuang Y, Kurimoto M, Yamamichi K, Iwano H, Tanji Y. 2009. Isolation from sewage influent and characterization of novel Staphylococcus aureus bacteriophages with wide host ranges and potent lytic capabilities. Appl Environ Microbiol 75:4483–4490. doi: 10.1128/AEM.02641-08.
- 30. Alves DR, Gaudion A, Bean JE, Esteban PP, Arnot TC, Harper DR, Kot W, Hansen LH, Enright MC, Jenkins ATA. 2014. Combined use of bacteriophage K and a novel bacteriophage to reduce Staphylococcus aureus biofilm formation. Appl Environ Microbiol 80:6694–6703. doi: 10.1128/AEM.01789-14.
- 31. O'Flaherty S, Ross RP, Meaney W, Fitzgerald GF, Elbreki MF, Coffey A. 2005. Potential of the polyvalent anti-Staphylococcus bacteriophage K for control of antibiotic-resistant staphylococci from hospitals. Appl Environ Microbiol 71:1836–1842. doi: 10.1128/AEM.71.4.1836-1842.2005.
- 32. Abatángelo V, Peressutti Bacci N, Boncompain CA, Amadio AF, Carrasco S, Suárez CA, Morbidoni HR. 2017. Broad-range lytic bacteriophages that kill Staphylococcus aureus local field strains. PLoS One 12:e0181671. doi: 10.1371/journal.pone.0181671.
- 33. Iwano H, Inoue Y, Takasago T, Kobayashi H, Furusawa T, Taniguchi K, Fujiki J, Yokota H, Usui M, Tanji Y, Hagiwara K, Higuchi H, Tamura Y. 2018. Bacteriophage ΦSA012 has a broad host range against Staphylococcus aureus and effective lytic capacity in a mouse mastitis model. Biology (Basel) 7:8. doi: 10.3390/biology7010008.
- 34. Peng C, Hanawa T, Azam AH, LeBlanc C, Ung P, Matsuda T, Onishi H, Miyanaga K, Tanji Y. 2019. Silviavirus phage ΦMR003 displays a broad host range against methicillin-resistant Staphylococcus aureus of human origin. Appl Microbiol Biotechnol 103:7751–7765. doi: 10.1007/s00253-019-10039-2.
- 35.Eaton MD, Bayne-Jones S. 1934. Bacteriophage therapy: review of the principles and results of the use of bacteriophage in the treatment of infections. JAMA 103:1847–1853. doi: 10.1001/jama.1934.72750500003009. [DOI] [Google Scholar]
- 36.Zschach H, Larsen MV, Hasman H, Westh H, Nielsen M, Międzybrodzki R, Jończyk-Matysiak E, Weber-Dąbrowska B, Górski A. 2018. Use of a regression model to study host-genomic determinants of phage susceptibility in MRSA. Antibiotics (Basel) 7:9. doi: 10.3390/antibiotics7010009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Waldron DE, Lindsay JA. 2006. Sau1: a novel lineage-specific type I restriction-modification system that blocks horizontal gene transfer into Staphylococcus aureus and between S. aureus isolates of different lineages. J Bacteriol 188:5578–5585. doi: 10.1128/JB.00418-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Roberts GA, Houston PJ, White JH, Chen K, Stephanou AS, Cooper LP, Dryden DTF, Lindsay JA. 2013. Impact of target site distribution for type I restriction enzymes on the evolution of methicillin-resistant Staphylococcus aureus (MRSA) populations. Nucleic Acids Res 41:7472–7484. doi: 10.1093/nar/gkt535. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Brown S, Xia G, Luhachack LG, Campbell J, Meredith TC, Chen C, Winstel V, Gekeler C, Irazoqui JE, Peschel A, Walker S. 2012. Methicillin resistance in Staphylococcus aureus requires glycosylated wall teichoic acids. Proc Natl Acad Sci U S A 109:18909–18914. doi: 10.1073/pnas.1209126109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Wang H, Gill CJ, Lee SH, Mann P, Zuck P, Meredith TC, Murgolo N, She X, Kales S, Liang L, Liu J, Wu J, Santa Maria J, Su J, Pan J, Hailey J, Mcguinness D, Tan CM, Flattery A, Walker S, Black T, Roemer T. 2013. Discovery of wall teichoic acid inhibitors as potential anti-MRSA β-lactam combination agents. Chem Biol 20:272–284. doi: 10.1016/j.chembiol.2012.11.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Münkemüller T, Lavergne S, Bzeznik B, Dray S, Jombart T, Schiffers K, Thuiller W. 2012. How to measure and test phylogenetic signal. Methods Ecol Evol 3:743–756. doi: 10.1111/j.2041-210X.2012.00196.x. [DOI] [Google Scholar]
- 42.Lees JA, Vehkala M, Välimäki N, Harris SR, Chewapreecha C, Croucher NJ, Marttinen P, Davies MR, Steer AC, Tong SYC, Honkela A, Parkhill J, Bentley SD, Corander J. 2016. Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes. Nat Commun 7:12797. doi: 10.1038/ncomms12797. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Collins C, Didelot X. 2018. A phylogenetic method to perform genome-wide association studies in microbes that accounts for population structure and recombination. PLoS Comput Biol 14:e1005958. doi: 10.1371/journal.pcbi.1005958. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Qian Z, Yin Y, Zhang Y, Lu L, Li Y, Jiang Y. 2006. Genomic characterization of ribitol teichoic acid synthesis in Staphylococcus aureus: genes, genomic organization and gene duplication. BMC Genomics 7:74. doi: 10.1186/1471-2164-7-74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Shaw DRD, Chatterjee AN. 1971. O-acetyl groups as a component of the bacteriophage receptor on Staphylococcus aureus cell walls. J Bacteriol 108:584–585. doi: 10.1128/JB.108.1.584-585.1971. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Berngruber TW, Weissing FJ, Gandon S. 2010. Inhibition of superinfection and the evolution of viral latency. J Virol 84:10200–10208. doi: 10.1128/JVI.00865-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, von Mering C. 2015. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43:D447–D452. doi: 10.1093/nar/gku1003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Mi H, Huang X, Muruganujan A, Tang H, Mills C, Kang D, Thomas PD. 2017. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Res 45:D183–D189. doi: 10.1093/nar/gkw1138. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Fey PD, Endres JL, Yajjala VK, Widhelm TJ, Boissy RJ, Bose JL, Bayles KW. 2013. A genetic resource for rapid and comprehensive phenotype screening of nonessential Staphylococcus aureus genes. mBio 4:e00537-12. doi: 10.1128/mBio.00537-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Wardenburg JB, Williams WA, Missiakas D. 2006. Host defenses against Staphylococcus aureus infection require recognition of bacterial lipoproteins. Proc Natl Acad Sci U S A 103:13831–13836. doi: 10.1073/pnas.0603072103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Proctor AR, Kloos WE. 1973. Tryptophan biosynthetic enzymes of Staphylococcus aureus. J Bacteriol 114:169–177. doi: 10.1128/JB.114.1.169-177.1973. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Kaur S, Harjai K, Chhibber S. 2012. Methicillin-resistant Staphylococcus aureus phage plaque size enhancement using sublethal concentrations of antibiotics. Appl Environ Microbiol 78:8227–8233. doi: 10.1128/AEM.02371-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Kelliher JL, Radin JN, Kehl-Fie TE. 2018. PhoPR contributes to Staphylococcus aureus growth during phosphate starvation and pathogenesis in an environment-specific manner. Infect Immun 86:e00371-18. doi: 10.1128/IAI.00371-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Botella E, Devine SK, Hubner S, Salzberg LI, Gale RT, Brown ED, Link H, Sauer U, Codée JD, Noone D, Devine KM. 2014. PhoR autokinase activity is controlled by an intermediate in wall teichoic acid metabolism that is sensed by the intracellular PAS domain during the PhoPR-mediated phosphate limitation response of Bacillus subtilis. Mol Microbiol 94:1242–1259. doi: 10.1111/mmi.12833. [DOI] [PubMed] [Google Scholar]
- 55.Liu W, Eder S, Hulett FM. 1998. Analysis of Bacillus subtilis tagAB and tagDEF expression during phosphate starvation identifies a repressor role for PhoP∼P. J Bacteriol 180:753–758. doi: 10.1128/JB.180.3.753-758.1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Myers CL, Li FKK, Koo B-M, El-Halfawy OM, French S, Gross CA, Strynadka NCJ, Brown ED. 2016. Identification of two phosphate starvation-induced wall teichoic acid hydrolases provides first insights into the degradative pathway of a key bacterial cell wall component. J Biol Chem 291:26066–26082. doi: 10.1074/jbc.M116.760447. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Jorge AM, Schneider J, Unsleber S, Xia G, Mayer C, Peschel A. 2018. Staphylococcus aureus counters phosphate limitation by scavenging wall teichoic acids from other staphylococci via the teichoicase GlpQ. J Biol Chem 293:14916–14924. doi: 10.1074/jbc.RA118.004584. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Ladjouzi R, Bizzini A, Lebreton F, Sauvageot N, Rincé A, Benachour A, Hartke A. 2013. Analysis of the tolerance of pathogenic enterococci and Staphylococcus aureus to cell wall active antibiotics. J Antimicrob Chemother 68:2083–2091. doi: 10.1093/jac/dkt157. [DOI] [PubMed] [Google Scholar]
- 59.Ladjouzi R, Bizzini A, van Schaik W, Zhang X, Rincé A, Benachour A, Hartke A. 2015. Loss of antibiotic tolerance in Sod-deficient mutants is dependent on the energy source and arginine catabolism in enterococci. J Bacteriol 197:3283–3293. doi: 10.1128/JB.00389-15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Sacher JC, Flint A, Butcher J, Blasdel B, Reynolds HM, Lavigne R, Stintzi A, Szymanski CM. 2018. Transcriptomic analysis of the Campylobacter jejuni response to T4-like phage NCTC 12673 infection. Viruses 10:332. doi: 10.3390/v10060332. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Chatterjee A, Willett JLE, Nguyen UT, Monogue B, Palmer KL, Dunny GM, Duerkop BA. 2020. Parallel genomics uncover novel enterococcal-bacteriophage interactions. mBio 11:e03120-19. doi: 10.1128/mBio.03120-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Peschel A, Jack RW, Otto M, Collins LV, Staubitz P, Nicholson G, Kalbacher H, Nieuwenhuizen WF, Jung G, Tarkowski A, van Kessel KPM, van Strijp JAG. 2001. Staphylococcus aureus resistance to human defensins and evasion of neutrophil killing via the novel virulence factor MprF is based on modification of membrane lipids with l-lysine. J Exp Med 193:1067–1076. doi: 10.1084/jem.193.9.1067. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Bayer AS, Schneider T, Sahl H-G. 2013. Mechanisms of daptomycin resistance in Staphylococcus aureus: role of the cell membrane and cell wall. Ann N Y Acad Sci 1277:139–158. doi: 10.1111/j.1749-6632.2012.06819.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Andrä J, Goldmann T, Ernst CM, Peschel A, Gutsmann T. 2011. Multiple peptide resistance factor (MprF)-mediated resistance of Staphylococcus aureus against antimicrobial peptides coincides with a modulated peptide interaction with artificial membranes comprising lysyl-phosphatidylglycerol. J Biol Chem 286:18692–18700. doi: 10.1074/jbc.M111.226886. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Young R. 2014. Phage lysis: three steps, three choices, one outcome. J Microbiol 52:243–258. doi: 10.1007/s12275-014-4087-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Cahill J, Young R. 2019. Phage lysis: multiple genes for multiple barriers, p 33–70. In Kielian M, Mettenleiter TC, Roossinck MJ (ed), Advances in virus research. Academic Press, New York, NY. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Catalão MJ, Gil F, Moniz-Pereira J, São-José C, Pimentel M. 2013. Diversity in bacterial lysis systems: bacteriophages show the way. FEMS Microbiol Rev 37:554–571. doi: 10.1111/1574-6976.12006. [DOI] [PubMed] [Google Scholar]
- 68.Geiger T, Goerke C, Fritz M, Schäfer T, Ohlsen K, Liebeke M, Lalk M, Wolz C. 2010. Role of the (p)ppGpp synthase RSH, a RelA/SpoT homolog, in stringent response and virulence of Staphylococcus aureus. Infect Immun 78:1873–1883. doi: 10.1128/IAI.01439-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Fernández L, González S, Campelo AB, Martínez B, Rodríguez A, García P. 2017. Low-level predation by lytic phage phiIPLA-RODI promotes biofilm formation and triggers the stringent response in Staphylococcus aureus. Sci Rep 7:40965. doi: 10.1038/srep40965. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Gaca AO, Colomer-Winter C, Lemos JA. 2015. Many means to a common end: the intricacies of (p)ppGpp metabolism and its control of bacterial homeostasis. J Bacteriol 197:1146–1156. doi: 10.1128/JB.02577-14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Conlon BP, Rowe SE, Gandt AB, Nuxoll AS, Donegan NP, Zalis EA, Clair G, Adkins JN, Cheung AL, Lewis K. 2016. Persister formation in Staphylococcus aureus is associated with ATP depletion. Nat Microbiol 1:16051. doi: 10.1038/nmicrobiol.2016.51. [DOI] [PubMed] [Google Scholar]
- 72.Torres VJ, Pishchany G, Humayun M, Schneewind O, Skaar EP. 2006. Staphylococcus aureus IsdB is a hemoglobin receptor required for heme iron utilization. J Bacteriol 188:8421–8429. doi: 10.1128/JB.01335-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.O'Riordan K, Lee JC. 2004. Staphylococcus aureus capsular polysaccharides. Clin Microbiol Rev 17:218–234. doi: 10.1128/cmr.17.1.218-234.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Lindsay JA, Ruzin A, Ross HF, Kurepina N, Novick RP. 1998. The gene for toxic shock toxin is carried by a family of mobile pathogenicity islands in Staphylococcus aureus. Mol Microbiol 29:527–543. doi: 10.1046/j.1365-2958.1998.00947.x. [DOI] [PubMed] [Google Scholar]
- 75.Su M, Satola SW, Read TD. 2019. Genome-based prediction of bacterial antibiotic resistance. J Clin Microbiol 57:e01405-18. doi: 10.1128/JCM.01405-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Goerke C, Pantucek R, Holtfreter S, Schulte B, Zink M, Grumann D, Bröker BM, Doskar J, Wolz C. 2009. Diversity of prophages in dominant Staphylococcus aureus clonal lineages. J Bacteriol 191:3462–3468. doi: 10.1128/JB.01804-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Hyman P, Abedon ST. 2010. Bacteriophage host range and bacterial resistance. Adv Appl Microbiol 70:217–248. doi: 10.1016/S0065-2164(10)70007-1. [DOI] [PubMed] [Google Scholar]
- 78.Power RA, Parkhill J, de Oliveira T. 2017. Microbial genome-wide association studies: lessons from human GWAS. 1. Nat Rev Genet 18:41–50. doi: 10.1038/nrg.2016.132. [DOI] [PubMed] [Google Scholar]
- 79.Read TD, Massey RC. 2014. Characterizing the genetic basis of bacterial phenotypes using genome-wide association studies: a new direction for bacteriology. Genome Med 6:109. doi: 10.1186/s13073-014-0109-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Mutalik VK, Adler BA, Rishi HS, Piya D, Zhong C, Koskella B, Kutter EM, Calendar R, Novichkov PS, Price MN, Deutschbauer AM, Arkin AP. 2020. High-throughput mapping of the phage resistance landscape in E. coli. PLoS Biol 18:e3000877. doi: 10.1371/journal.pbio.3000877. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Dickey J, Perrot V. 2019. Adjunct phage treatment enhances the effectiveness of low antibiotic concentration against Staphylococcus aureus biofilms in vitro. PLoS One 14:e0209390. doi: 10.1371/journal.pone.0209390. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Weidenmaier C, Kokai-Kun JF, Kristian SA, Chanturiya T, Kalbacher H, Gross M, Nicholson G, Neumeister B, Mond JJ, Peschel A. 2004. Role of teichoic acids in Staphylococcus aureus nasal colonization, a major risk factor in nosocomial infections. Nat Med 10:243–245. doi: 10.1038/nm991. [DOI] [PubMed] [Google Scholar]
- 83.Bera A, Biswas R, Herbert S, Kulauzovic E, Weidenmaier C, Peschel A, Götz F. 2007. Influence of wall teichoic acid on lysozyme resistance in Staphylococcus aureus. J Bacteriol 189:280–283. doi: 10.1128/JB.01221-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Kohler T, Weidenmaier C, Peschel A. 2009. Wall teichoic acid protects Staphylococcus aureus against antimicrobial fatty acids from human skin. J Bacteriol 191:4482–4484. doi: 10.1128/JB.00221-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Peschel A, Otto M, Jack RW, Kalbacher H, Jung G, Götz F. 1999. Inactivation of the dlt operon in Staphylococcus aureus confers sensitivity to defensins, protegrins, and other antimicrobial peptides. J Biol Chem 274:8405–8410. doi: 10.1074/jbc.274.13.8405. [DOI] [PubMed] [Google Scholar]
- 86.van Dalen R, Diaz J, Rumpret M, Fuchsberger FF, van Teijlingen NH, Hanske J, Rademacher C, Geijtenbeek TBH, van Strijp JAG, Weidenmaier C, Peschel A, Kaplan DH, van Sorge NM. 2019. Langerhans cells sense Staphylococcus aureus wall teichoic acid through langerin to induce inflammatory responses. mBio 10:e00330-19. doi: 10.1128/mBio.00330-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Wanner S, Schade J, Keinhörster D, Weller N, George SE, Kull L, Bauer J, Grau T, Winstel V, Stoy H, Kretschmer D, Kolata J, Wolz C, Bröker BM, Weidenmaier C. 2017. Wall teichoic acids mediate increased virulence in Staphylococcus aureus. Nat Microbiol 2:16257. doi: 10.1038/nmicrobiol.2016.257. [DOI] [PubMed] [Google Scholar]
- 88.Lindsay JA. 2014. Staphylococcus aureus genomics and the impact of horizontal gene transfer. Int J Med Microbiol 304:103–109. doi: 10.1016/j.ijmm.2013.11.010. [DOI] [PubMed] [Google Scholar]
- 89.Everitt RG, Didelot X, Batty EM, Miller RR, Knox K, Young BC, Bowden R, Auton A, Votintseva A, Larner-Svensson H, Charlesworth J, Golubchik T, Ip CLC, Godwin H, Fung R, Peto TEA, Walker AS, Crook DW, Wilson DJ. 2014. Mobile elements drive recombination hotspots in the core genome of Staphylococcus aureus. Nat Commun 5:3956. doi: 10.1038/ncomms4956. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Narita S, Kaneko J, Chiba J, Piémont Y, Jarraud S, Etienne J, Kamio Y. 2001. Phage conversion of Panton-Valentine leukocidin in Staphylococcus aureus: molecular analysis of a PVL-converting phage, φSLT. Gene 268:195–206. doi: 10.1016/s0378-1119(01)00390-0. [DOI] [PubMed] [Google Scholar]
- 91.Krausz KL, Bose JL. 2016. Bacteriophage transduction in Staphylococcus aureus: broth-based method. Methods Mol Biol 1373:63–68. doi: 10.1007/7651_2014_185. [DOI] [PubMed] [Google Scholar]
- 92.Su M, Lyles JT, Iii RAP, Peterson J, Hargita M, Tang H, Solis-Lemus C, Quave CL, Read TD. 2020. Genomic analysis of variability in Delta-toxin levels between Staphylococcus aureus strains. PeerJ 8:e8717. doi: 10.7717/peerj.8717. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Alam MT, Petit RA, Crispell EK, Thornton TA, Conneely KN, Jiang Y, Satola SW, Read TD. 2014. Dissecting vancomycin-intermediate resistance in Staphylococcus aureus using genome-wide association. Genome Biol Evol 6:1174–1185. doi: 10.1093/gbe/evu092. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Bernardy EE, Petit RA, Moller AG, Blumenthal JA, McAdam AJ, Priebe GP, Chande AT, Rishishwar L, Jordan IK, Read TD, Goldberg JB. 2019. Whole-genome sequences of Staphylococcus aureus isolates from cystic fibrosis lung infections. Microbiol Resour Announc 8:e01564-18. doi: 10.1128/MRA.01564-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Wickham H. 2016. ggplot2: elegant graphics for data analysis. Springer, New York, NY. [Google Scholar]
- 96.Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 2017. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27:722–736. doi: 10.1101/gr.215087.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. 2018. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun 9:5114. doi: 10.1038/s41467-018-07641-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Petit RA III, Read TD. 2018. Staphylococcus aureus viewed from the perspective of 40,000+ genomes. PeerJ 6:e5261. doi: 10.7717/peerj.5261. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. doi: 10.1093/bioinformatics/btu153. [DOI] [PubMed] [Google Scholar]
- 102.Page AJ, Cummins CA, Hunt M, Wong VK, Reuter S, Holden MTG, Fookes M, Falush D, Keane JA, Parkhill J. 2015. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31:3691–3693. doi: 10.1093/bioinformatics/btv421. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Croucher NJ, Page AJ, Connor TR, Delaney AJ, Keane JA, Bentley SD, Parkhill J, Harris SR. 2015. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Res 43:e15. doi: 10.1093/nar/gku1196. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32:268–274. doi: 10.1093/molbev/msu300. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Seemann T. 2020. tseemann/mlst. Perl: https://github.com/tseemann/mlst. [Google Scholar]
- 106.Jolley KA, Bray JE, Maiden MCJ. 2018. Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications. Wellcome Open Res 3:124. doi: 10.12688/wellcomeopenres.14826.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Letunic I, Bork P. 2019. Interactive Tree Of Life (iTOL) v4: recent updates and new developments. Nucleic Acids Res 47:W256–W259. doi: 10.1093/nar/gkz239. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Jombart T, Balloux F, Dray S. 2010. adephylo: new tools for investigating the phylogenetic signal in biological traits. Bioinformatics 26:1907–1909. doi: 10.1093/bioinformatics/btq292. [DOI] [PubMed] [Google Scholar]
- 109.Revell LJ. 2012. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol Evol 3:217–223. doi: 10.1111/j.2041-210X.2011.00169.x. [DOI] [Google Scholar]
- 110.Page AJ, Taylor B, Delaney AJ, Soares J, Seemann T, Keane JA, Harris SR. 2016. SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments. Microb Genom 2:e000056. doi: 10.1099/mgen.0.000056. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Hadfield J, Croucher NJ, Goater RJ, Abudahab K, Aanensen DM, Harris SR. 2018. Phandango: an interactive viewer for bacterial population genomics. Bioinformatics 34:292–293. doi: 10.1093/bioinformatics/btx610. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms. Fly (Austin) 6:80–92. doi: 10.4161/fly.19695. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Kuroda M, Ohta T, Uchiyama I, Baba T, Yuzawa H, Kobayashi I, Cui L, Oguchi A, Aoki K, Nagai Y, Lian J, Ito T, Kanamori M, Matsumaru H, Maruyama A, Murakami H, Hosoyama A, Mizutani-Ui Y, Takahashi NK, Sawano T, Inoue R, Kaito C, Sekimizu K, Hirakawa H, Kuhara S, Goto S, Yabuzaki J, Kanehisa M, Yamashita A, Oshima K, Furuya K, Yoshino C, Shiba T, Hattori M, Ogasawara N, Hayashi H, Hiramatsu K. 2001. Whole genome sequencing of meticillin-resistant Staphylococcus aureus. Lancet 357:1225–1240. doi: 10.1016/s0140-6736(00)04403-2. [DOI] [PubMed] [Google Scholar]
- 114.Monk IR, Tree JJ, Howden BP, Stinear TP, Foster TJ. 2015. Complete bypass of restriction systems for major Staphylococcus aureus lineages. mBio 6:e00308-15. doi: 10.1128/mBio.00308-15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Löfblom J, Kronqvist N, Uhlén M, Ståhl S, Wernérus H. 2007. Optimization of electroporation-mediated transformation: Staphylococcus carnosus as model organism. J Appl Microbiol 102:736–747. doi: 10.1111/j.1365-2672.2006.03127.x. [DOI] [PubMed] [Google Scholar]
- 116.Chen T, He T. 2020. xgboost: eXtreme gradient boosting 4. http://cran.fhcrc.org/web/packages/xgboost/vignettes/xgboost.pdf. [Google Scholar]
- 117.Kingma DP, Ba J. 2017. Adam: a method for stochastic optimization. arXiv 14126980 [cs] https://arxiv.org/abs/1412.6980.
- 118.Shannon CE. 1948. A mathematical theory of communication. Bell Syst Technical J 27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x. [DOI] [Google Scholar]
- 119.Grant SG, Jessee J, Bloom FR, Hanahan D. 1990. Differential plasmid rescue from transgenic mouse DNAs into Escherichia coli methylation-restriction mutants. Proc Natl Acad Sci U S A 87:4645–4649. doi: 10.1073/pnas.87.12.4645. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Nair D, Memmi G, Hernandez D, Bard J, Beaume M, Gill S, Francois P, Cheung AL. 2011. Whole-genome sequencing of Staphylococcus aureus strain RN4220, a key laboratory strain used in virulence research, identifies mutations that affect not only virulence factors but also the fitness of the strain. J Bacteriol 193:2332–2335. doi: 10.1128/JB.00027-11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Christie GE, Matthews AM, King DG, Lane KD, Olivarez NP, Tallent SM, Gill SR, Novick RP. 2010. The complete genomes of Staphylococcus aureus bacteriophages 80 and 80α—implications for the specificity of SaPI mobilization. Virology 407:381–390. doi: 10.1016/j.virol.2010.08.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Ajuebor J, Buttimer C, Arroyo-Moreno S, Chanishvili N, Gabriel EM, O’Mahony J, McAuliffe O, Neve H, Franz C, Coffey A. 2018. Comparison of Staphylococcus phage K with close phage relatives commonly employed in phage therapeutics. Antibiotics 7:37. doi: 10.3390/antibiotics7020037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Berryhill BA, McCall IC, Huseby DL, Hughes D, Levin B. 2020. Joint antibiotic and phage therapy: addressing the limitations of a seemingly ideal phage for treating Staphylococcus aureus infections. bioRxiv doi: 10.1101/2020.07.24.218685. [DOI] [PMC free article] [PubMed]
Supplementary Materials
FIG S1. Scree plot used to pick the number of dimensions for multidimensional scaling (MDS) in pyseer COG significance analysis. The number of dimensions (PCs) picked was the smallest (42) after which the eigenvalues stabilized with respect to dimension number. TIF file, 0.2 MB.
Copyright © 2021 Moller et al.
This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.
FIG S2. pyseer k-mer Q-Q plots for each phage (p0045, p0006, p0017, p0017S, p002y, p003p, p0040, and pyo). The observed P values were plotted relative to the expected P values based on the null distribution. Expected P values were plotted with a 95% confidence interval on the diagonal. Deviation of the observed/expected curve from the diagonal indicated P value inflation. TIF file, 0.9 MB.
FIG S3. GWAS approach and significant SNP annotations. (A) Overview of the genome-wide association study (GWAS) workflow. pyseer (42) associated intermediate-frequency COGs, core genome SNPs, and k-mers with each host range phenotype, while treeWAS (43) only associated core genome SNPs with each host range phenotype. SnpEff (112) classified mutation effects (synonymous, missense, or nonsense) from the corresponding Roary (102) gene sequence, while STRING (47) identified putative protein-protein interactions and PANTHER (48) identified enriched functions from lists of genes corresponding to each significant SNP or k-mer. (B) Classification of significantly associated pyseer or treeWAS SNPs based on mutational effect (synonymous, missense, or nonsense). SnpEff annotated SNP effects based on corresponding genes identified in the tested strains’ core genome with Roary. Phage 0045 was not included, as no significant SNPs were detected for its host range phenotype. Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple. TIF file, 0.8 MB.
FIG S4. Growth curves of USA300, USA300 transposon mutants (A), transposon mutants electroporated with the empty pOS1 vector (B), and transposon mutants complemented with vectors containing the respective genes (C) (trpA, phoR, isdB, sodM, fmtC, and relA). Strains were inoculated with a 96-pin replicator from arrayed frozen glycerol stocks into 96-well plates containing 200 μl LB-TSB 2:1 with 5 mM CaCl2 or the same medium supplemented with 10 μg/ml chloramphenicol in each well. We then diluted each culture 1:100 in fresh LB-TSB 2:1 with 5 mM CaCl2 or the same medium supplemented with 10 μg/ml chloramphenicol and collected growth curves on a BioTek Eon plate reader (37°C, 225 rpm agitation, OD600 measured every 10 min). TIF file, 1.4 MB.
FIG S5. Bacterial survival after completion of the high-throughput host range assay (p003p against trpA strains). The high-throughput assay was performed for six biological replicates of USA300, USA300 trpA::Tn, USA300 trpA::Tn pOS1, and USA300 trpA::Tn pOS1 trpA strains. (A) ODs were measured for the high-throughput phage host range assay replicates as described previously. (B) Agar plugs were removed with toothpicks and transferred to 0.8-ml volumes of sterile TMG, and bacteria were resuspended by vortexing. The resuspensions were serially diluted in TMG, and 4 μl of 1e−1 through 1e−6 dilutions were spotted four times on TSA plates. Dilution plates were grown overnight at 37°C, and colonies were counted the following day to determine the number of surviving CFU under each condition. TIF file, 0.4 MB.
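The colony counts described for the survival assay can be converted to CFU with simple arithmetic. The following is a minimal sketch, not the authors' script: the function names are hypothetical, and it assumes that the four replicate spots at one countable dilution are averaged and that the titer is scaled by the 0.8-ml resuspension volume to give CFU per agar plug.

```python
def cfu_per_ml(colony_counts, dilution, spot_volume_ul=4.0):
    """Estimate CFU/ml from replicate spot counts at a single dilution.

    colony_counts: colonies counted in each replicate spot (four per dilution here).
    dilution: dilution factor of the spotted suspension, e.g. 1e-3.
    spot_volume_ul: spotted volume in microliters (4 ul in the caption).
    """
    if not colony_counts or dilution <= 0 or spot_volume_ul <= 0:
        raise ValueError("need at least one count and positive dilution/volume")
    mean_count = sum(colony_counts) / len(colony_counts)
    spot_volume_ml = spot_volume_ul / 1000.0
    return mean_count / (dilution * spot_volume_ml)


def cfu_per_plug(colony_counts, dilution, resuspension_ml=0.8, spot_volume_ul=4.0):
    """Scale CFU/ml by the resuspension volume (assumed 0.8 ml TMG per plug)."""
    return cfu_per_ml(colony_counts, dilution, spot_volume_ul) * resuspension_ml


if __name__ == "__main__":
    counts = [10, 12, 8, 10]  # illustrative counts, not data from the study
    print(f"{cfu_per_ml(counts, 1e-3):.3g} CFU/ml")
    print(f"{cfu_per_plug(counts, 1e-3):.3g} CFU per plug")
```

Averaging replicate spots before scaling keeps a single stray colony from dominating the estimate; other conventions, such as taking the median, would work equally well.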
FIG S6. High-throughput host range assay phenotypes demonstrating genetic validation of novel GWAS phage host range determinants. Results are grouped by gene (trpA, phoR, isdB, sodM, fmtC, and relA) and phage (p0045, p0017S, p003p, p0040, p0006, p002y, pyo, and no phage). Each group includes four strains demonstrating complementation with proper controls (USA300, USA300 transposon mutant, USA300 transposon mutant with empty pOS1 vector, and USA300 transposon mutant complemented with gene in pOS1 vector). All significant (P < 0.05) pairwise differences (Wilcoxon signed-rank test) are shown at the top of the corresponding box plots. Siphoviridae are listed in red, Myoviridae in blue, and the no-phage control in gray. PDF file, 0.1 MB.
FIG S7. Efficiency of plating (EOP) phenotypes demonstrating genetic validation of phage host range determinants. Phage dilutions from undiluted through 1e−8 were spotted (4 μl) three times on each top agar lawn, allowed to dry, and incubated face up overnight at 37°C, and plaques were counted at the lowest countable dilution. EOP was calculated relative to the average PFU/ml for the control strain, USA300 JE2. Results are grouped by gene (trpA, phoR, isdB, sodM, fmtC, and relA) and phage (p0045, p0017S, p003p, p0040, p0006, p002y, and pyo). Siphoviridae are listed in red, and Myoviridae in blue. Each group includes four strains demonstrating complementation with controls (USA300, USA300 transposon mutant, USA300 transposon mutant with empty pOS1 vector, and USA300 transposon mutant complemented with gene in pOS1 vector). All significant (P < 0.05) pairwise differences (Wilcoxon signed-rank test) are shown at the top of the corresponding box plots.
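The EOP calculation described in this legend can be sketched as follows: each spot count is converted to a titer in PFU/ml, and each test-strain titer is divided by the mean titer on the control strain. This is a minimal illustration under those assumptions; the function names and all plaque counts are hypothetical.

```python
def pfu_per_ml(plaque_count, dilution, spot_volume_ul=4.0):
    """Titer implied by one spot: plaques / (spot volume in ml * dilution)."""
    if dilution <= 0:
        raise ValueError("dilution must be positive (1.0 means undiluted)")
    return plaque_count / ((spot_volume_ul / 1000.0) * dilution)


def efficiency_of_plating(test_titers, control_titers):
    """Return one EOP value per test titer, relative to the mean control titer."""
    if not control_titers:
        raise ValueError("control titers are required")
    control_mean = sum(control_titers) / len(control_titers)
    if control_mean == 0:
        raise ValueError("control titer is zero; EOP is undefined")
    return [titer / control_mean for titer in test_titers]


# Hypothetical triplicate spot counts at the 1e-6 dilution.
control = [pfu_per_ml(n, 1e-6) for n in (20, 22, 18)]  # control strain lawn
mutant = [pfu_per_ml(n, 1e-6) for n in (10, 12, 8)]    # mutant strain lawn
print(efficiency_of_plating(mutant, control))
```

An EOP near 1 indicates plating comparable to the control strain, and values below 1 indicate reduced plaque formation on the test strain. In this made-up example the three EOP values are approximately 0.5, 0.6, and 0.4.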
FIG S8. Construction of neural network predictive models for each ternary phage resistance phenotype. Quantitative host range phenotypes were classified as sensitive (S), semisensitive (SS), or resistant (R) based on OD600 bins of 0.1 to 0.4, 0.4 to 0.7, and 0.7 or more, respectively. Data preprocessing included oversampling (p0045, p0017S, p002y, p003p, and pyo), lasso regression (p0017), both (p0006), or neither (p0040). (A) Predictive accuracies for each phage based on neural networks and four sets of predictors: all significant GWAS genetic determinants (COGs, SNPs, and k-mers) for a particular phage, all determinants plus corresponding strain sequence type and clonal complex (ST and CC), significant k-mers for a particular phage, and significant k-mers plus strain ST and CC. Average accuracies of four replicates are presented with 1 standard error above and below the mean. Validation accuracy represents the proportion of correctly identified ternary phenotypes in the validation set (30% of the strain set). (B) Average accuracies from four replicates and all significant GWAS determinants as predictors relative to the proportion of each ternary phenotype (S, SS, or R) among tested strains for the corresponding phage. Three points on the same horizontal line are shown for each validation accuracy result (one for each of the three possible phenotypes). (C) Average accuracies from four replicates and all significant GWAS determinants as predictors relative to the information entropy for each host range phenotype, which was calculated as described in Materials and Methods. Information entropy was calculated with a natural logarithm and is reported in natural units (nats). Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
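The ternary binning and the entropy measure used in panel C can be illustrated with a short sketch. Two points here are assumptions rather than statements from the legend: bin boundaries are treated as half-open intervals (0.4 falls in SS, 0.7 falls in R), and any value below 0.4, including readings under 0.1, is assigned to S. The entropy function is the standard Shannon entropy with a natural logarithm, which is assumed to match the calculation referenced in Materials and Methods. Function names and OD600 values are hypothetical.

```python
import math
from collections import Counter


def classify_od600(od600):
    """Bin an OD600 reading into S, SS, or R.

    Assumption: boundaries are half-open, and values below 0.4
    (including any below 0.1) are called sensitive.
    """
    if od600 >= 0.7:
        return "R"
    if od600 >= 0.4:
        return "SS"
    return "S"


def entropy_nats(phenotypes):
    """Shannon entropy H = -sum(p * ln p) over observed phenotype classes."""
    counts = Counter(phenotypes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one phenotype is required")
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical OD600 readings for six strains against one phage.
readings = [0.15, 0.35, 0.55, 0.72, 0.90, 0.41]
calls = [classify_od600(x) for x in readings]
print(calls)
print(entropy_nats(calls))
```

For three classes the entropy ranges from 0 nats (every strain has the same phenotype) to ln 3, about 1.0986 nats, when S, SS, and R are equally common.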
FIG S9. Evaluation of ternary phage resistance phenotype predictive models through receiver operating characteristic (ROC) area under the curve (AUC). Quantitative host range phenotypes were classified as sensitive (S), semisensitive (SS), or resistant (R) based on OD600 bins of 0.1 to 0.4, 0.4 to 0.7, and 0.7 or more, respectively. Data preprocessing included oversampling (p0045, p0017S, p002y, p003p, and pyo), lasso regression (p0017), both (p0006), or neither (p0040). (A) Tenfold cross-validation (CV) ROC AUCs for each phage based on two model building methods (randomForest and XGBoost) and four sets of predictors: all significant GWAS genetic determinants (COGs, SNPs, and k-mers) for a particular phage, all determinants plus corresponding strain sequence type and clonal complex (ST and CC), significant k-mers for a particular phage, and significant k-mers plus strain ST and CC. Average ROC AUCs of four 10-fold CV replicates are presented with 1 standard error above and below the mean. (B) Average ROC AUCs from four 10-fold CV replicates for each model building method and all significant GWAS determinants as predictors relative to the proportion of each ternary phenotype (S, SS, or R) among tested strains for the corresponding phage. Three points are shown for each ROC AUC (one for each of the three possible phenotypes). (C) Average ROC AUCs from four 10-fold CV replicates for each model building method and all significant GWAS determinants as predictors relative to the information entropy for each host range phenotype, which was calculated as described in Materials and Methods. Information entropy was calculated with a natural logarithm and is reported in natural units (nats). Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
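The legend does not state how a single ROC AUC was obtained from a three-class phenotype. One common option, shown here purely as an assumption and not as the authors' method, is a one-vs-rest AUC: one phenotype is treated as positive, the other two as negative, and the AUC is the probability that a randomly chosen positive strain receives a higher predicted score than a randomly chosen negative strain (ties count as one half). The function names, scores, and labels below are hypothetical.

```python
def roc_auc(scores, is_positive):
    """AUC via pairwise comparison of positive and negative scores.

    Equivalent to the Mann-Whitney U statistic divided by the number
    of positive-negative pairs; tied scores contribute 0.5.
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(class_scores, true_labels, positive_class):
    """AUC for one phenotype (e.g. "R") against the other two combined."""
    flags = [label == positive_class for label in true_labels]
    return roc_auc(class_scores, flags)


# Hypothetical predicted probabilities of "R" for four strains.
scores_r = [0.9, 0.2, 0.8, 0.1]
labels = ["R", "R", "S", "SS"]
print(one_vs_rest_auc(scores_r, labels, "R"))
```

In this made-up example, three of the four positive-negative pairs are ranked correctly, so the one-vs-rest AUC for "R" is 0.75. Repeating the calculation for S and SS and averaging would give a macro-averaged AUC, which is again only one of several possible conventions.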
FIG S10. Evaluation of ternary phage resistance phenotype neural network predictive models through receiver operating characteristic (ROC) area under the curve (AUC). Quantitative host range phenotypes were classified as sensitive (S), semisensitive (SS), or resistant (R) based on the bins 0.1 to 0.4, 0.4 to 0.7, and 0.7 or more (OD600), respectively. (A) ROC AUCs for each phage based on neural network models and four sets of predictors: all significant GWAS genetic determinants (COGs, SNPs, and k-mers) for a particular phage, all determinants plus corresponding strain sequence type and clonal complex (ST and CC), significant k-mers for a particular phage, and significant k-mers plus strain ST and CC. Average ROC AUCs of four replicates are presented with 1 standard error above and below the mean. (B) Average ROC AUCs from four replicates and all significant GWAS determinants as predictors relative to the proportion of each ternary phenotype (S, SS, or R) among tested strains for the corresponding phage. Three points are shown for each ROC AUC (corresponding to each of the three possible phenotypes). (C) Average ROC AUCs from four replicates and all significant GWAS determinants as predictors relative to the information entropy for each host range phenotype, which was calculated as described in Materials and Methods. Information entropy was calculated with a natural logarithm in natural units (nats). Siphoviridae are listed in red, Myoviridae in blue, and Podoviridae in purple.
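The "average of four replicates with 1 standard error above and below the mean" summary can be computed as in the following sketch. It assumes the standard error is the sample standard deviation divided by the square root of the number of replicates; the function name and the replicate AUC values are hypothetical.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error) for a list of replicate results.

    Assumption: standard error = sample standard deviation / sqrt(n).
    """
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical ROC AUCs from four replicate model fits for one phage.
replicate_aucs = [0.80, 0.90, 0.85, 0.95]
mean, se = mean_and_standard_error(replicate_aucs)
print(f"mean = {mean:.3f}, error bar = {mean - se:.3f} to {mean + se:.3f}")
```

The error bar drawn in panel A would then span from the mean minus one standard error to the mean plus one standard error.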