Abstract
Endolysins are produced by (bacterio)phages and play a crucial role in degrading the bacterial cell wall and the subsequent release of new phage progeny. These lytic enzymes exhibit a remarkable diversity, often occurring in a multimodular form that combines different catalytic and cell wall-binding domains, even in phages infecting the same species. Yet, our current understanding lacks insight into how environmental factors and ecological niches may have influenced the evolution of these enzymes. In this study, we focused on phages infecting Streptococcus thermophilus, as this bacterial species has a well-defined and narrow ecological niche, namely, dairy fermentation. Among the endolysins found in phages targeting this species, we observed limited diversity, with a single structural type dominating in most identified S. thermophilus phages. Within this prevailing endolysin type, we discovered a novel and highly conserved calcium-binding motif. This motif proved to be crucial for the stability and activity of the enzyme at elevated temperatures. Ultimately, we demonstrated its positive selection within the host's environmental conditions, particularly under the temperature profiles encountered in the production of yogurt, mozzarella, and hard cheeses that rely on S. thermophilus.
Keywords: bacteriophages, endolysins, host adaptation, experimental evolution, Streptococcus thermophilus, dairy fermentation
Introduction
Bacteriophages, also referred to as phages, are bacterial viruses that are found in the same ecological niches as their hosts. They are highly diverse and are classified into numerous viral families based on their genome composition (Hambly and Suttle 2005; Dion et al. 2020). Phages play a pivotal role in regulating bacterial populations by shaping their evolution through selective pressure and gene transfer (Clokie et al. 2011). They can also serve as biocontrol agents to prevent or suppress bacterial infections in plants, animals, and humans and are being actively considered as an alternative or complement to antibiotics in response to multidrug-resistant infections (Lin et al. 2017). However, for decades, phages have also been considered “enemies” in various industrial processes where they sometimes kill bacteria that play a beneficial role in fermentation or in biotechnology. For instance, phage-induced lysis of starter cultures remains the primary cause of milk fermentation delays in the cheese industry, resulting in low-quality products (Garneau and Moineau 2011).
The ability of phages to lyse their bacterial host at the end of their lytic cycle involves the coordinated action of at least two types of viral proteins: holins and endolysins (Fernandes and Sao-Jose 2018). The holin protein is synthesized intracellularly to generate pores in the bacterial plasma membrane. These pores allow the endolysin, a peptidoglycan hydrolase, to enter the extracytoplasmic space and degrade the cell wall (Wang et al. 2000).
The cell wall is a vital structure composed of a complex meshwork of N-acetylglucosamine (GlcNAc)-N-acetylmuramic acid (MurNAc) glycan strands that are cross-linked by short stem peptides attached to MurNAc residues (Vollmer et al. 2008). While all phage endolysins have the same function of cleaving peptidoglycan, their structure and mode of action can differ significantly. In general, endolysins that target Gram-positive bacteria have a modular structure that includes one or more enzymatically active domains (EADs) connected to a cell wall-binding domain (CBD) by a short linker (Schmelcher et al. 2012). Based on the peptidoglycan bond they hydrolyze, endolysins are currently classified into four primary groups. These include glucosaminidases and muramidases that hydrolyze glycan chains as well as amidases and endopeptidases that hydrolyze the peptidoglycan stem peptides or interconnecting bridges. The CBD provides high bacterial specificity and helps optimally position the enzyme to degrade its peptidoglycan substrate (Loessner et al. 2002; Hermoso et al. 2003). Overall, EADs and CBDs can be very different, and their various combinations have resulted in a high degree of diversity throughout phage evolution (Fernández-Ruiz et al. 2018; Vázquez et al. 2021).
The diversity of the endolysin domains could result from their evolution to target essential cell wall components (Fischetti 2008) or as an adaptation to preferred or new phage hosts (Loessner et al. 2002). In a recent study on phages infecting the Gram-positive dairy bacterium Lactococcus lactis, we demonstrated that the diversity of endolysins was mostly the result of host adaptation (Oechslin et al. 2022). We found that diverse endolysin genes could be exchanged between phages from different viral genera with minimal fitness costs, provided both the donor and recipient phages infected the same bacterial strain.
While these findings indicate a clear dependence between endolysin and the host, our current understanding lacks insights into how environmental factors and ecological niches may have additionally shaped the evolution of these enzymes. Here, we focus on the bacterial host Streptococcus thermophilus. This species is extensively used in the industrial production of yogurt and hard cheeses as it is well adapted to the milk environment (Bolotin et al. 2004; Hols et al. 2005). This bacterium is also susceptible to a set of diverse phages, spanning at least five viral genera (Philippe et al. 2020).
Interestingly, we observed a limited diversity of endolysins in phages infecting this bacterial species, with a singular structural type dominating in over 90% of identified S. thermophilus phages and occurring in four out of the five phage genera. Within this prevailing endolysin type, we discovered a novel and highly conserved calcium-binding motif (CaBM) that is crucial for the stability and activity of the enzyme at elevated temperatures. We demonstrated its positive selection under the temperature profiles encountered during the production of yogurt, mozzarella, or hard cheese, where thermophilic lactic acid bacteria such as S. thermophilus are used.
Results
Endolysin Diversity in Phages Infecting S. thermophilus Is Limited, With a Specific Structural Type Prevailing
To assess the diversity of endolysins present in streptococcal phages, we first retrieved 195 genomes of phages infecting S. thermophilus from NCBI and analyzed their phylogenetic relatedness and gene conservation (Fig. 1). Notably, these phages were sourced from various milk fermentation samples across a wide range of geographical locations (22 countries) and different time periods, mitigating potential biases stemming from a few dairy manufacturers publishing the majority of S. thermophilus phage genomes (supplementary table S1, Supplementary Material online). The phage genomes represented the five phage genera infecting S. thermophilus, namely, Moineauvirus (former cos), Brussowvirus (former pac), Vansinderenvirus (former 5093), 987, and P738. Based on in silico predictions, we identified five structural types, of which one type (group A) was clearly predominant, as it was observed in 91.3% (n = 178/195) of the phage genomes analyzed and was distributed among four of the five phage genera (all except P738) (Fig. 1A). This type of endolysin had a predicted N-terminal catalytic amidase_5 domain belonging to the NlpC/P60 family of cell wall peptidases and a C-terminal region homologous to the target recognition domain of zoocin A, a Zn metallopeptidase secreted by Streptococcus zooepidemicus (Anantharaman and Aravind 2003; Chen et al. 2013). Remarkably, this specific allele showed the highest level of conservation in the entire streptococcal phage pangenome (Fig. 1B). It also appeared to be less functionally constrained than other conserved alleles coding for nonstructural proteins like holins, or structural proteins such as major capsid, tail, or portal proteins (Fig. 1B).
Fig. 1.
Distribution of endolysin types in streptococcal phage genomes. (A) A set of 195 complete phage genomes infecting S. thermophilus, available on NCBI as of February 26, 2023, were analyzed for endolysin diversity. Multiple sequence alignments were performed using MUSCLE (v3.8.425), and the phylogenetic tree was generated using Neighbor-Joining and a Jukes–Cantor genetic distance model. Conserved enzymatically active domains (EADs) and cell wall-binding domains (CBDs) were predicted using HHpred, BLASTP, and AlphaFold. Endolysins were classified into five groups based on their catalytic domains and are represented by different colors. Group A endolysins, with an average predicted molecular weight of 30.8 ± 0.3 kDa, were found in 91% (n = 178) of the phage genomes. They have a predicted N-terminal catalytic amidase_5 domain that belongs to the NlpC/P60 family of cell wall peptidases and a C-terminal region homologous to the target recognition domain of zoocin A (Anantharaman and Aravind 2003; Chen et al. 2013). The EAD belongs to the cysteine/histidine-dependent amidohydrolase/peptidase family (CHAP), as the characteristic cysteine (C35) and histidine (H101) residues were conserved (supplementary fig. S1, Supplementary Material online) (Rigden et al. 2003). Half of the group A endolysin genes were disrupted by a group IA2 intron, as reported previously (Foley et al. 2000). The intron found in the endolysin gene of phage 2972 (Brussowvirus) is provided as an example. Group B endolysins (2.6%, n = 5, 32.8 ± 1.0 kDa) were predicted to be muramidases containing a glycosyl hydrolase family 25 (GH25) catalytic domain and a C-terminal region that lacked homology with known CBDs. Endolysins from group C (4.6%, n = 9, 50.8 ± 0.1 kDa) had the same predicted overall architecture as those in group B but with an additional N-terminal CHAP domain that showed homology with the Enterococcus phage endolysin LysIME-EF1 (Zhou et al. 2020).
Group D endolysins (1.0%, n = 2, 27.4 ± 0.1 kDa) were rare and also had an N-terminal CHAP domain homologous to that of LysIME-EF1, but with a C-terminal domain without homology to known CBDs. Finally, group E (0.5%, n = 1, 45.55 kDa) contained only one member, namely the endolysin from phage CHPC1109. This very rare endolysin was predicted to have a GH25 EAD muramidase and two additional C-terminal LysM domains. AlphaFold structure prediction revealed an additional third domain of unknown function that was not predicted by either BLAST or HHpred (supplementary fig. S2, Supplementary Material online). Accession numbers for each endolysin are followed by the name of the phage in which it was found, and each phage genus is depicted by a unique color. (B) The gene conservation of all genes in the S. thermophilus phage genomes is depicted through protein identity and the number of phage genomes that contain each gene (density plot on the x axis). Additionally, the dN/dS substitution rate and gene size are represented by the point color and size, respectively. The ten most prevalent genes are annotated for reference.
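The Jukes–Cantor distance model named in the caption of Fig. 1A can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names are hypothetical, gap handling (skipping any column with a gap) is an assumption, and the number of character states is left as a parameter because the caption does not state whether the distances were computed on nucleotide (4 states) or protein (20 states) alignments.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (an assumption; other tools may treat gaps differently).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float, n_states: int = 4) -> float:
    """Jukes-Cantor corrected distance d = -b * ln(1 - p / b), b = (n - 1) / n."""
    b = (n_states - 1) / n_states
    if p >= b:
        # Saturated: the correction is undefined at or beyond this point.
        return math.inf
    return -b * math.log(1 - p / b)


if __name__ == "__main__":
    p = p_distance("ACGT-A", "ACGA-T")
    print(p, jukes_cantor(p))
```

The resulting pairwise distances would then feed a Neighbor-Joining tree builder; that step is omitted here.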
Endolysins of the Predominant Group A All Possess a Highly Conserved Calcium-binding Motif
To investigate the physicochemical properties crucial for the adaptation of phage endolysins to the ecological niche of S. thermophilus, we selected the endolysin from phage 2972 (Lys2972, group A) due to its significant conservation among streptococcal phages.
The determination of the crystal structure of Lys2972 confirmed substantial homology with known structures. The EAD (residues 1 to 149) displayed similarity to members of the cell wall peptidase superfamily NlpC/P60, including the bacterial peptidoglycan hydrolase SagA from Enterococcus faecium (Fig. 2A and supplementary fig. S3, Supplementary Material online) (Kim et al. 2019). Similarly, the CBD (residues 155 to 280) exhibited structural homology with the target recognition domain (TRD) of the bacterial Zoocin A, which was previously determined by NMR (Chen et al. 2013) (Fig. 2A and supplementary fig. S3, Supplementary Material online). In addition, Lys2972 had a relatively short and rigid linker segment (residues 150 to 154), leading to significant contacts between the two domains and an overall compact enzyme structure.
Fig. 2.

Crystal structure of Lys2972. (A) The crystal structure of Lys2972 was determined in the C2 space group, with one molecule per asymmetric unit. All the residues, except for 235 to 237 and the final three residues (278 to 280) at the C-terminal position, were successfully built. The endolysin is color coded in a rainbow scheme with the N-terminal in blue and the C-terminal in red. Both the EAD (residues 1 to 149) and CBD (residues 155 to 280) domains are labeled and the metal ion (Na+ or Ca2+) bound to the CBD domain is shown as a purple sphere. (B) The interactions between the metal (represented as Na+ in this structure) and the main chain carbonyls of W209 and I217, the side chain oxygen atoms of N208, D216, and D219, and a water molecule are depicted as dashed lines. (C) Graphical representation of the multiple sequence alignment of the CaBM region found in the 178 group A endolysin sequences used in this study. (D) Surface representation of Lys2972 colored according to its electrostatic potential (ranging from −5 kBT/e [red] to 5 kBT/e [blue]) and calculated using PyMol (Schrödinger). The metal-binding site is largely negatively charged and the enzyme is shown in the same orientation as in panel A.
Interestingly, a metal ion was identified at the edge of the CBD domain and was observed to be coordinated by the carboxylate groups of Asp216 and Asp219, the main chain carbonyl groups of Trp209 and Ile217, as well as the side chain carbonyl of Asn208 and a water molecule (Fig. 2B). While this novel metal-binding site and its corresponding residues were absent in the previously mentioned Zoocin A TRD structure, they were remarkably conserved in all group A endolysins (Fig. 2C). An electrostatic surface charge calculation further revealed a heavily negatively charged patch at this position (Fig. 2D), consistent with its metal-binding capacity. Considering that the coordination number of the metal is six, all the atoms in the coordination sphere of this metal are oxygens (coming from carboxylate, carbonyl, or water molecules), and the metal-ligand distances range between 2.2 and 2.5 Å, with an average value of 2.41 Å, this metal is expected to be either calcium or sodium (Zheng et al. 2017). Although refinement with Na+ has resulted in a temperature factor more in line with that of the coordinating oxygen atoms, this may be attributed to the low pH (5.0), which diminishes the binding capacity of D216 and D219 for divalent calcium ions, and the high abundance of sodium ions in the crystallization condition (>100 mM). However, it can be presumed that under neutral pH conditions and/or in the presence of higher calcium concentration found in milk, calcium is likely the preferred ion bound to this metal-binding motif. Consequently, this specific motif was further designated as a calcium-binding motif (CaBM).
The CaBM Is Important for the Thermal Stability of Group A Endolysins
Following the structural analysis described above, the role of the CaBM in the lytic activity of Lys2972 was investigated by inactivating it. The amino acids providing their side chains for metal coordination were replaced by alanine to construct single mutants (Lys2972 > N208A, Lys2972 > D216A, and Lys2972 > D219A) as well as a triple mutant designated Lys2972CaBM (Lys2972 > N208A + D216A + D219A). After Ni-affinity and size exclusion chromatography, the purity and molecular weight of each endolysin were confirmed on 4% to 12% BisTris gels (supplementary fig. S4, Supplementary Material online).
Unexpectedly, we observed that the CaBM had only a limited contribution to the cell lysis activity of the enzyme compared to the overall contribution of the CBD at 37 °C. The Lys2972CaBM mutant demonstrated a 32.3 ± 2.4% decrease in activity compared to the wild-type (at a protein concentration of 1 µM, Fig. 3A). In contrast, a mutant lacking the entire CBD (Lys2972ΔCBD) exhibited a hundred-fold decrease in activity. At a concentration of 0.2 µM, the wild-type Lys2972 achieved a cell turbidity reduction of approximately 60% after 15 min (Fig. 3A), whereas Lys2972ΔCBD required a concentration of 20 µM to attain a comparable level of activity (supplementary fig. S5B, Supplementary Material online). We also did not observe a significant reliance of the enzyme on calcium for its activity. Chelation of calcium using ethylenediaminetetraacetic acid (10 mM) resulted in only an 11.5 ± 1.5% decrease in Lys2972 activity (supplementary fig. S5A, Supplementary Material online), while the presence of 10 mM CaCl2 led to a 24.4 ± 2.0% increase, a phenomenon not observed in the Lys2972CaBM mutant (Fig. 3A). This enhancement in lytic activity was also noted for the Lys2972 > N208A (11.4 ± 1.3%) and Lys2972 > D216A (18.0 ± 2.4%) mutants, albeit to a lesser extent than for wild-type Lys2972 (29.6 ± 1.9%) (at a protein concentration of 0.5 µM, Fig. 3B). Notably, the Lys2972 > D219A mutant did not exhibit any increase in enzymatic activity in the presence of calcium, underscoring the crucial role of the D219 position in calcium binding.
Fig. 3.
Importance of CaBM for the lytic activity and heat stability of Lys2972. (A) Characterization of the lytic activity of Lys2972 and its inactivated CaBM mutant (Lys2972CaBM) at various concentrations and with or without the addition of 10 mM CaCl2. The lytic activity was determined by measuring the decrease in turbidity after 15 min of S. thermophilus DGCC7710 cells in exponential growth phase. (B) Characterization of the lytic activity of Lys2972 and CaBM derivative mutants (0.5 μM) after the addition of 0.5, 2, or 10 mM CaCl2 compared to values obtained without the addition of CaCl2. (C) Graph of the melting temperature (Tm) of Lys2972 and Lys2972CaBM with or without the addition of 10 mM CaCl2. The thermal shift assay was performed using SYPRO Orange, and the Tm (lowest part of the curve) was determined by plotting the first derivative of the fluorescence emission as a function of 0.5 °C temperature steps. (D) Graph of the thermal stability of Lys2972 and Lys2972CaBM after a 15-min incubation step at various temperatures and with or without 10 mM CaCl2. The lytic activity was determined by measuring the decrease in turbidity after 15 min of S. thermophilus DGCC7710 cells in the exponential growth phase. Experiments were performed in triplicate.
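The first-derivative readout described in panel C can be sketched in Python as follows. This is an illustration rather than the instrument software: it assumes the plotted curve is the negative first derivative (so that the Tm sits at the lowest point), uses a central finite difference, and is demonstrated on a synthetic sigmoid. The function names and the sigmoid parameters are hypothetical.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature at the minimum of -dF/dT.

    The derivative is estimated with a central finite difference,
    so the first and last points are not considered.
    """
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    best_temp, best_value = None, math.inf
    for i in range(1, len(temps) - 1):
        d_fluo = fluorescence[i + 1] - fluorescence[i - 1]
        d_temp = temps[i + 1] - temps[i - 1]
        neg_derivative = -d_fluo / d_temp
        if neg_derivative < best_value:
            best_temp, best_value = temps[i], neg_derivative
    return best_temp


def synthetic_curve(tm, start=25.0, stop=95.0, step=0.5, slope=1.5):
    """Idealized unfolding curve (sigmoid) sampled in 0.5 degree C steps.

    Real SYPRO Orange traces are noisier and often decline after the
    transition; this toy curve only serves to exercise the function.
    """
    n_points = int(round((stop - start) / step)) + 1
    temps = [start + i * step for i in range(n_points)]
    fluo = [1.0 / (1.0 + math.exp(-(t - tm) / slope)) for t in temps]
    return temps, fluo


if __name__ == "__main__":
    temps, fluo = synthetic_curve(tm=54.5)
    print(melting_temperature(temps, fluo))
```

With noisy experimental data, some smoothing before differentiation would likely be needed; none is applied here.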
The relatively low dependence of Lys2972 on the CaBM for lytic activity suggests that it may play a more important role in other functional aspects of the enzyme. Considering that other studies have shown that calcium can increase the heat stability of various proteases (Kirberger and Yang 2013; Milles et al. 2018), and recognizing that S. thermophilus is a thermophilic lactic acid bacterium that thrives at elevated temperatures compared to other bacterial genera found in milk environments, we conducted a thermal shift assay to compare the heat stability of Lys2972 and its CaBM mutants. Our results indicated that in the presence of 10 mM CaCl2, the melting temperature of the wild-type Lys2972 increased from 54.5 to 60.0 °C (Fig. 3C). However, no such increase was observed with the mutants Lys2972CaBM, Lys2972 > N208A, Lys2972 > D216A, and Lys2972 > D219A or when 10 mM of ethylenediaminetetraacetic acid was added to Lys2972 (Fig. 3C and supplementary fig. S6, Supplementary Material online). The importance of CaBM for the stability of Lys2972 was further substantiated by testing the cell lysis activity of the endolysin at various temperatures. In the presence of 10 mM CaCl2, Lys2972 retained its lytic activity at temperatures up to 56.6 °C, and such temperature-dependent activity needed CaCl2 or an active CaBM (Fig. 3D), validating this unique metal-binding motif.
Inactivation of CaBM in Phage 2972 Leads to Temperature-dependent Fitness Costs at Temperatures That Typically Occur During Yogurt Production
S. thermophilus is essential for the production of yogurt, a process that requires high incubation temperatures, typically ranging from 40 to 46 °C in industrial settings (Lee and Lucey 2004). Given this context, we investigated whether temperature could affect the replication of a phage mutant with an inactivated CaBM. This was achieved by replacing the three critical amino acids responsible for coordinating calcium (N208, D216, and D219) with alanines using CRISPR-Cas9 genome editing. The resulting phage mutant, designated 2972CaBM, produced clear plaques at 37 °C (supplementary fig. S7, Supplementary Material online). However, while the lytic plaques produced by phages 2972 (0.34 ± 0.12 mm²) and 2972CaBM (0.33 ± 0.11 mm²) showed no statistical differences at 37 °C (P-value = 0.1716), the 2972CaBM mutant produced smaller plaques at 42 °C (0.28 ± 0.09 vs. 0.25 ± 0.09 mm², P ≤ 0.001) and at 46 °C (0.17 ± 0.05 vs. 0.11 ± 0.04 mm², P ≤ 0.0001) (Fig. 4A). Furthermore, we conducted competition assays between phages 2972 and 2972CaBM at a low multiplicity of infection, reflecting the initial infection dynamics within an industrial environment (Marcó et al. 2012). Observations revealed a fitness cost associated with the loss of CaBM, particularly noticeable at elevated temperatures (−0.31 ± 0.03 at 37 °C, −0.57 ± 0.07 at 42 °C, and −0.89 ± 0.08 at 46 °C) (Fig. 4B). Additionally, we observed a linear correlation between the relative fitness and variations in plaque size of phages 2972 and 2972CaBM across different temperatures (Fig. 4C).
Fig. 4.

Importance of the CaBM in the lytic activity of phage 2972 at different temperatures. (A) Size of lytic plaques produced by phage 2972 or 2972CaBM when grown at different temperatures (nsP > 0.05, ***P ≤ 0.001, ****P ≤ 0.0001, using Welch’s two-sample t-test). The bacterial host S. thermophilus DGCC7710 was used for plaque visualization using a double-layer agar assay. (B) Impact of CaBM inactivation on phage 2972 fitness. A direct competition assay was performed by co-amplifying both phages 2972 and 2972CaBM at an initial MOI of 0.0001 and at different temperatures. After overnight incubation, the population of phage 2972CaBM was quantified using the strain S. thermophilus DGCC7710 BIM/Lys2972 (resistant to phage 2972 but not 2972CaBM) and compared with the titer of the total phage population obtained using the strain S. thermophilus DGCC7710. (C) Linear correlation between relative fitness and difference in plaque size between phages 2972 and 2972CaBM.
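The titer comparison described in panel B can be sketched as follows. The first function follows the caption directly: the mutant fraction is the titer on the resistant indicator strain divided by the titer on the wild-type host. The second function is an assumption: the exact relative fitness formula is not stated in this section, so a simple change in the log mutant-to-wild-type ratio is shown as one possible choice, not as the authors' calculation. All names and example titers are hypothetical.

```python
import math


def mutant_fraction(titer_on_bim: float, titer_on_wt_host: float) -> float:
    """Fraction of 2972CaBM in a mixed lysate.

    titer_on_bim: PFU/ml on the strain resistant to wild-type phage 2972
                  (only the mutant forms plaques).
    titer_on_wt_host: PFU/ml on DGCC7710 (both phages form plaques).
    """
    if titer_on_wt_host <= 0:
        raise ValueError("total titer must be positive")
    if not 0 <= titer_on_bim <= titer_on_wt_host:
        raise ValueError("mutant titer must lie between 0 and the total titer")
    return titer_on_bim / titer_on_wt_host


def log_ratio_change(frac_start: float, frac_end: float) -> float:
    """Change in ln(mutant / wild-type) over the competition.

    ASSUMPTION: illustrative metric only. Negative values mean the
    mutant lost ground relative to the wild-type phage.
    """
    for frac in (frac_start, frac_end):
        if not 0 < frac < 1:
            raise ValueError("fractions must lie strictly between 0 and 1")
    ratio_start = frac_start / (1 - frac_start)
    ratio_end = frac_end / (1 - frac_end)
    return math.log(ratio_end) - math.log(ratio_start)


if __name__ == "__main__":
    # Hypothetical titers: equal mix at the start, 20% mutant at the end.
    end_fraction = mutant_fraction(2e8, 1e9)
    print(end_fraction, log_ratio_change(0.5, end_fraction))
```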
CaBM Also Provides a Fitness Advantage Under Thermal Conditions Used in Cheese Production
S. thermophilus is not only used as a starter culture in yogurt production but also in the manufacturing of Italian and Swiss hard cheeses such as Parmigiano Reggiano, Gruyère, and Emmental (Gobbetti and Calasso 2014). In the production of these cheeses, temperatures that exceed the streptococcal maximal growth threshold are often used. After fermentation, the milk is curdled and cooked at temperatures ranging from 50 to 58 °C to reduce excess water content, resulting in cheeses with a hard consistency (Di Cagno and Gobbetti 2011). Similarly, S. thermophilus is used in the production of mozzarella cheese, where the curd undergoes a brief heating and stretching process at temperatures ranging between 55 and 65 °C (Yu and Gunasekaran 2005).
To assess whether the thermostable properties of CaBM could enhance the resilience of the phage lytic cycle to temperature fluctuations exceeding the growth maxima of its host, we initially subjected S. thermophilus cells infected with either phage 2972 or 2972CaBM to a 15-min heat shock (60 °C) at different time intervals after infection. A previous study on the global gene expression of phage 2972 revealed that lys2972 was maximally expressed 27 min after infection (Duplessis et al. 2005). When infected cells were subjected to heat shock within the first 20 min of infection, no significant differences were observed between phages 2972 and 2972CaBM (supplementary fig. S8A to C, Supplementary Material online). In both cases, bacterial lysis was delayed until host growth resumed. However, when the heat shock was applied 30 min after infection or later, bacterial lysis occurred only in cells infected with 2972 and not with 2972CaBM (supplementary fig. S8D to F, Supplementary Material online). Exposure of infected cells to 55 °C also revealed a similar trend, albeit with a significantly shorter delay in lysis time for 2972CaBM compared to the 60 °C stress (supplementary fig. S9, Supplementary Material online).
We then investigated whether CaBM would provide a fitness advantage under conditions simulating hard cheese production. This included the infection of the cells at a low MOI of 0.0001 (Marcó et al. 2012) and the exposure of the infected cells to a temperature profile used for the production of Emmental cheese (Fig. 5A) (Vuillemard 2018). Under these thermal conditions, we observed that cell cultures infected with phage 2972 exhibited earlier lysis than those infected with 2972CaBM. This was supported by the consistently shorter time at maximum rate (t at max V) of cell lysis kinetics for 2972 compared to 2972CaBM (Fig. 5B to D). Conversely, no significant differences were observed between the two phages in the control condition where cells were infected at the same MOI and at a constant temperature of 40 °C (Fig. 5E). Finally, the CaBM was again found to provide a selective advantage under these thermal conditions during a competition test between phages 2972 and 2972CaBM. The fitness cost associated with the absence of a functional CaBM gradually increased with the duration of the first incubation step at 35 °C (−0.52 ± 0.06 for 60 min, −0.62 ± 0.05 for 70 min, and −0.83 ± 0.1 for 80 min). The fitness cost for phage 2972CaBM was significantly higher than that observed at a constant temperature of 40 °C for the 70- and 80-min incubation steps at 35 °C, but not for the 60-min step (P-value = 0.1054 for 60 min, P-value = 0.0032 for 70 min, and P-value = 0.0016 for 80 min).
Fig. 5.
Lytic activity of phages 2972 and 2972CaBM under thermal conditions used in hard cheese production. (A) S. thermophilus DGCC7710 cells early in the exponential phase were infected with phage 2972 or 2972CaBM at an MOI of 0.0001. Immediately after infection, cells were incubated in a programmable water bath with a temperature profile used to produce Emmental (Vuillemard 2018). The first incubation step at 35 °C was tested for different time periods: (B) 60 min, (C) 70 min, or (D) 80 min. Then, the temperature was raised to 55 °C over 20 min, maintained for another 30 min, and gradually lowered to 40 °C over 60 min to simulate the cooling and pressing of the cheese curd. After completion of the thermal cycle, the cells were transferred to a microplate reader and turbidity was measured every 15 min over a 16-h period. (E) A control was performed by incubating the infected cells at a constant temperature of 40 °C. (F) Direct competition assay for relative fitness between phages 2972 and 2972CaBM under the thermal conditions used to produce pressed cooked cheese. Equal amounts of phages 2972 and 2972CaBM were competed against each other at an MOI of 0.0001 and under incubation conditions used in (B) through (E). The titer of phage 2972CaBM after an amplification step was determined on strain S. thermophilus DGCC7710 BIM250 (resistant to phage 2972 but not to 2972CaBM) and compared with the titer of the total phage population obtained on strain DGCC7710.
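The thermal cycle described in panel A can be written as a piecewise function of time, which may help when reprogramming a water bath or plotting the profile. The sketch below uses only the set points given in the caption; the linear shape of the two ramps is an assumption (the caption says only that the temperature was raised and gradually lowered), and the function name is hypothetical.

```python
def emmental_profile(t_min: float, first_step_min: float = 60.0) -> float:
    """Bath temperature (degrees C) at t_min minutes after infection.

    Segments: hold at 35 for first_step_min (60, 70, or 80 min in the
    caption), ramp to 55 over 20 min, hold at 55 for 30 min, cool to 40
    over 60 min. Ramps are assumed to be linear.
    """
    segments = [
        # (duration in min, start temperature, end temperature)
        (first_step_min, 35.0, 35.0),
        (20.0, 35.0, 55.0),
        (30.0, 55.0, 55.0),
        (60.0, 55.0, 40.0),
    ]
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, temp_start, temp_end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return temp_start + fraction * (temp_end - temp_start)
        elapsed += duration
    raise ValueError("time is beyond the end of the thermal cycle")


if __name__ == "__main__":
    for minute in (0, 60, 70, 80, 110, 140, 170):
        print(minute, emmental_profile(minute))
```

With the 60-min first step the whole cycle lasts 170 min; the 70- and 80-min variants shift every later segment by 10 and 20 min, respectively.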
Finally, we confirmed that this fitness advantage can be attributed to the ability of phages with a functional CaBM to lyse cells at 55 °C. To verify this, phage-infected cells were subjected to a 150-min incubation at 55 °C, starting 45 min post-infection (supplementary fig. S10A to C, Supplementary Material online). At 55 °C, bacterial growth was inhibited, yet phage 2972 retained its ability to lyse the host cells, whereas phage 2972CaBM did not. These results align with the observation that Lys2972 could only lyse cells at 55 °C in the presence of a functional CaBM and calcium, as depicted in supplementary fig. S10D, Supplementary Material online.
Phages With a Functional CaBM Are Selected at Temperatures That Prevail During the Production of Yogurt and Cheese
To further substantiate the importance of the CaBM under thermal conditions encountered during the production of fermented milk products, such as yogurt or cheeses, we generated a phage mutant with the CaBM inactivated through a single nucleotide substitution (2972D219A). The D219 position of Lys2972, which is important for heat stability, was changed to an alanine residue through a single nucleotide change at the second position of the codon (A to C). After subjecting the phage population to experimental evolution (30 transfers) under different thermal conditions (Fig. 6A and B), we quantified the frequency of the C to A reversion in the CaBM using Illumina MiSeq sequencing. Notably, we did not observe any significant reversion toward the functional wild-type CaBM when the phage mutant population was evolved at 37 °C. The SNP frequency remained below the detection limit, with a mean frequency of 0.000767 ± 0.000089 at 37 °C compared to 0.00125 ± 0.00077 for the control (see Fig. 6C and supplementary fig. S11, Supplementary Material online). However, a significant enrichment in SNP frequency leading to a functional CaBM was observed when the phage mutant population was evolved at higher temperatures, such as 46 °C (0.148 ± 0.252), heat shock (0.0124 ± 0.0184), and cheese-making conditions (0.0142 ± 0.00736).
Fig. 6.
Quantifying the functional reversion of an inactivated CaBM during an experimental evolution assay. (A) A phage CaBM mutant with a D219A substitution was created by changing the adenine present in the second position of the codon to a cytosine. The phage mutant was then serially transferred with its host for a total of 30 transfers, each conducted with a final volume of 10 ml and a hundred-fold dilution rate. n = 3. (B) Throughout each of these transfers, phage-infected cells were exposed to a range of thermal conditions, including temperatures of 37 °C and 46 °C, as well as a 15-min heat shock at 60 °C. Additionally, infected cells were also exposed to thermal conditions that mimic those encountered during the production of pressed cooked cheese. (C) After the 30 transfers, targeted sequencing of the CaBM at the population level was performed with Illumina MiSeq to measure the SNP frequency (C to A reversion) under the different incubation conditions. The detection limit was set based on the initial SNP frequency found in the phage population before the transfers.
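The population-level readout in panel C amounts to counting reads that carry the revertant base at the mutated codon position and comparing the resulting frequency with the detection limit. The Python sketch below illustrates that arithmetic under stated assumptions: read counts are hypothetical, the three replicates are summarized as mean and sample standard deviation, and the comparison against the limit uses the replicate mean. It does not reproduce the authors' sequencing pipeline or statistics.

```python
from statistics import mean, stdev


def snp_frequency(revertant_reads: int, total_reads: int) -> float:
    """Fraction of reads covering the site that carry the revertant base (A)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= revertant_reads <= total_reads:
        raise ValueError("revertant_reads must lie between 0 and total_reads")
    return revertant_reads / total_reads


def summarize(frequencies, detection_limit: float) -> dict:
    """Mean and sample SD across replicates, plus a flag for enrichment.

    ASSUMPTION: enrichment is judged on the replicate mean; the source
    does not specify the statistical test used.
    """
    avg = mean(frequencies)
    spread = stdev(frequencies) if len(frequencies) > 1 else 0.0
    return {"mean": avg, "sd": spread, "above_limit": avg > detection_limit}


if __name__ == "__main__":
    # Hypothetical read counts for three replicate lineages.
    replicates = [
        snp_frequency(1000, 100000),
        snp_frequency(2000, 100000),
        snp_frequency(3000, 100000),
    ]
    # 0.00125 is the control frequency reported in the text.
    print(summarize(replicates, detection_limit=0.00125))
```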
Discussion
The diversity of phage endolysins observed in both cultured and uncultured phage metagenomic data is extensive, encompassing hundreds of structural types that result from the association of various EADs and CBDs through phage evolution (Fernández-Ruiz et al. 2018; Vázquez et al. 2021). The biological relevance of such diversity likely arises from adaptation to the Gram-positive or Gram-negative nature of the cell wall, as well as changes in the peptidoglycan structure at the species level, with additional host-specific adaptations at the strain level (Oliveira et al. 2013; Vázquez et al. 2021; Oechslin et al. 2022). However, our understanding of how environmental factors and ecological niches may have impacted the diversity and evolution of these complex enzymes is limited. In this study, we have focused on the endolysins of phages infecting S. thermophilus, as this nonpathogenic streptococcal species has evolved in a well-defined and narrow ecological niche characterized mainly by milk-induced speciation.
We identified five distinct structural types of endolysins in phages infecting S. thermophilus, among which one type (group A) predominated, being observed in 91% (n = 178) of the analyzed phage genomes. Given the prevalence of this type of endolysin, we selected it as a model to gain insight into the physicochemical properties essential for the adaptation of a phage endolysin to the host environmental niche. Upon determining the crystal structure of the group A endolysin from phage 2972, we observed a distinctive calcium-binding motif (CaBM) conserved across all its representatives. Surprisingly, deletion of the CaBM, either in the purified enzyme or directly in phage 2972, resulted in only a marginal reduction in the endolysin's activity, with no measurable effect on phage replication. Previous studies have shown that calcium can modulate the flexibility of the substrate-binding regions of enzymes and influence their affinity, as in the case of proteinase K (Bajorath et al. 1989; Müller et al. 1994; Liu et al. 2011). Similarly, the presence of calcium, which promotes non-covalent interactions within the CBD of Lys2972, may influence the flexibility of its substrate-binding region. This modulation could, in turn, affect mechanisms such as induced fit and conformational selection, potentially contributing to the subtle variations observed in Lys2972 activity (Henzler-Wildman and Kern 2007; Lange et al. 2008). However, the limited reliance of Lys2972 on the conserved CaBM for its lytic activity could imply a more significant role for this motif in other aspects of the enzyme's function. Previous studies have indicated that calcium can increase the heat stability of various proteases (Kirberger and Yang 2013; Milles et al. 2018). Similarly, we also observed that the thermal stability of Lys2972 increased in the presence of calcium, an effect that was not observed when the CaBM was inactivated by alanine substitutions.
S. thermophilus holds significant economic importance in the dairy industry, serving as a thermophilic starter for the production of yogurt and certain types of cheese, including mozzarella and hard cooked cheese. These dairy products require high fermentation and processing temperatures. For instance, yogurt production typically involves fermentation temperatures between 40 and 46 °C (Lee and Lucey 2004), whereas in the production of mozzarella cheese, the curd must be heated and stretched to temperatures ranging from 55 to 65 °C over a short period of time (Yu and Gunasekaran 2005). Similarly, S. thermophilus’s ability to withstand high processing temperatures is crucial for the production of Swiss cheeses such as Emmental and Gruyère, as well as Italian cheeses like Grana Padano and Parmigiano Reggiano. During these cheese-making processes, the curd is typically cooked at temperatures ranging from 50 to 58 °C to reduce excess water content, resulting in cheeses with a hard texture (Di Cagno and Gobbetti 2011). Therefore, the unique ability of the CaBM to thermally stabilize endolysins in the presence of calcium, a component typically found in milk at about 10 mM (Lewis 2011), represents a key factor in the phage's adaptation to the S. thermophilus dairy fermentation niche.
To further support the adaptive significance of the CaBM in the dairy environment, we generated different phage mutants in which this motif was inactivated by alanine substitutions. Our observations revealed a significant decrease in fitness during competitive interactions with the wild-type phage, especially at temperatures and infection dynamics that occur in yogurt and cheese production. Through experimental evolution, we also confirmed that the thermal conditions encountered during the production of yogurt, pressed cooked cheese, or mozzarella (60 °C heat shock) can act as a strong bottleneck for selecting phages that have reverted to a functional CaBM and consequently possess thermostable endolysins.
Interestingly, the prevalence of group A endolysins suggests that the diversity of these enzymes among phages infecting S. thermophilus is limited. This is especially noticeable when compared to phages infecting other lactic acid bacteria, such as the mesophilic bacterium L. lactis, where 11 distinct groups of endolysins could be observed across 253 lactococcal phage genomes (Oechslin et al. 2022). The reduced diversity may be attributed not only to the high fermentation and processing temperatures, which serve as a strong bottleneck for selecting a limited subset of thermostable endolysins, but also to the lower genetic diversity of the host. In contrast to L. lactis, which was until recently classified into four subspecies and found in various sources such as plants, animals, and raw milk, S. thermophilus has a much lower genetic diversity and is rarely found outside the dairy environment (Delorme 2008; Passerini et al. 2010; Delorme et al. 2017; Li et al. 2021). Our recent findings suggest that the diversity of endolysins in lactococcal phages mostly results from unique adaptations to the bacterial host diversity (Oechslin et al. 2022). Therefore, the low genetic diversity of the hosts may have favored the use of a specific type of endolysin, selected for its thermostable properties by fermentation practices and adopted by most phage genera infecting this species.
In summary, our findings not only advance our understanding of phage–host interactions but also provide valuable insights into the intricate adaptations that enable phages to thrive in specific ecological niches. The centuries-long domestication of S. thermophilus by humans has induced genetic changes, including the loss of specific bacterial genes. Our study suggests that industrial practices like yogurt or cheese production serve as robust barriers, shaping the genetic profiles of streptococcal phages and promoting the selection of thermostable endolysins. This process may have constrained their diversity due to the low genetic variability of the host species, ultimately favoring the prevalence of a single adapted endolysin type.
Methods
Bacterial Strains, Phages, and Growth Conditions
The biological materials and plasmids used in this study are listed in supplementary table S2, Supplementary Material online. S. thermophilus DGCC7710 and phage 2972 were obtained from the Félix d’Hérelle Reference Center for Bacterial Viruses (www.phage.ulaval.ca). S. thermophilus strains were grown without agitation at 42 °C in M17 broth supplemented with 0.5% lactose (LM17) and 1.0% agar (for solid media). When needed, chloramphenicol was added to the growth medium at a final concentration of 5 μg/ml (Cm 5). Escherichia coli strains were grown at 37 °C with agitation (220 rpm) in LB or plated on LB with agar (LBA). If necessary, LB was supplemented with kanamycin sulfate (30 μg/ml) or chloramphenicol (25 μg/ml for LBA plates and 50 μg/ml for LB). For phage infection, 10 mM CaCl2 was added to M17, and the double-layer plaque assays were performed as previously described (Kropinski et al. 2009).
Plasmid Construction
Plasmids were purified from overnight bacterial cultures using a QIAprep Spin Miniprep Kit (Qiagen). Polymerase chain reactions (PCR) were performed with Taq polymerase (Feldan) for screening purposes or Q5 high-fidelity DNA polymerase (New England Biolabs) for cloning and site-directed mutagenesis. Gibson assembly master mix was prepared as previously described (Gibson et al. 2009) and restriction enzymes were purchased from New England Biolabs. Primers used in this study are listed in supplementary table S2, Supplementary Material online.
Phylogenetic Analysis and Gene Conservation of Endolysins in Virulent Phages That Infect S. thermophilus
A total of 195 phage genomes that were annotated to infect S. thermophilus were retrieved from NCBI (02/26/2023). Then, they were analyzed for the presence of endolysin genes and deduced amino acid sequences. Group IA2 introns present in some of the endolysin genes were manually removed from group A endolysin sequences. MUSCLE (v3.8.425) was used to perform the multiple sequence alignment with a maximum of eight iterations (Edgar 2004). The phylogenetic tree was generated using Neighbor-Joining and a Jukes–Cantor genetic distance model. The visualization and re-rooting of the tree were done with Interactive Tree Of Life (iTOL) v5 (Letunic and Bork 2021). The domains were annotated with BLASTP, HHpred, and AlphaFold predictions (Altschul et al. 1997; Söding et al. 2005; Mirdita et al. 2022). The gene conservations were predicted by clustering the genes with cd-hit-est at 80% identity (Fu et al. 2012). The dn/ds ratios were calculated by calling single nucleotide polymorphisms on the MUSCLE alignments with freebayes and SNPeffect (De Baets et al. 2012).
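As a rough illustration of the distance model named above, the Jukes–Cantor correction converts the observed proportion p of differing sites between two aligned sequences into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3) in its four-state nucleotide form. The sketch below is not the pipeline used in this study: the function name is hypothetical, the nucleotide form is assumed, and skipping gapped columns is an assumption made for simplicity.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap in either sequence are ignored.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    # Observed proportion of differing sites.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

A pairwise matrix of such distances is the input that a Neighbor-Joining implementation would then cluster into a tree.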
Cloning, Expression, and Purification of Lys2972 and Its Mutant Derivatives
Because of the presence of a group IA2 intron in the gene encoding Lys2972, lys2972 was amplified from phage 2972 genomic DNA in two fragments with primer pairs Lys2972_F1_Fw/Lys2972_F1_Rv and Lys2972_F2_Fw/Lys2972_F2_Rv. The lys2972 fragments were then cloned into a pET28a expression vector using Gibson assembly. Primers with complementary ends (35 bp) were designed to bind either to the opposite lys2972 fragment or to a linearized pET28a vector. The linearized vector was obtained by PCR using the primer pairs pET28a_Fw/pET28a_rv and a BamHI-digested pET28a plasmid as a template. The subsequent plasmid, pLys297228a, was used as a template for site-directed mutagenesis to either introduce alanine substitutions into the Lys2972 CaBM and catalytic site or to generate a CBD-deficient variant. To accomplish this, the Q5 Site-Directed Mutagenesis Kit (NEB) was used with primer pairs designed using the manufacturer's online software (nebasechanger.neb.com). All constructs were confirmed by sequencing.
For protein purification, cells were first cultured overnight in 50 ml of TB medium; 5 ml of this culture was then transferred to 800 ml of medium and grown to an optical density (OD600nm) of ∼0.6 before induction with 1 mM isopropyl-beta-D-1-thiogalactopyranoside (IPTG) and further incubation for 16 h at 18 °C. The cell pellets were harvested and suspended in 45 ml of lysis buffer containing 50 mM Tris-HCl pH 7.5, 300 mM NaCl, 5% glycerol, 5 mM imidazole, and 1 mM PMSF. After incubation with DNase (5 μg/ml) and lysozyme (2 mg/ml), the cells were lysed by sonication. The resulting lysate was subjected to ultracentrifugation, and the supernatant was then subjected to nickel affinity chromatography. The column was washed several times with buffers containing increasing imidazole concentrations, from 5 to 20 mM. Proteins were subsequently eluted using a buffer containing 200 mM imidazole. The eluted protein was applied onto a Superdex 200 Increase (10/300GL) column for buffer exchange into 20 mM Tris-HCl pH 7.3, 150 mM NaCl, and 1% glycerol. The pooled fractions were concentrated to 15 to 20 mg/ml before crystallization setup.
Crystallization, Data Collection, and Structure Determination
The crystallization of Lys2972 was performed at room temperature via the micro-batch-under-oil approach, where the protein and reservoir solution were mixed at a 1:1 ratio (v/v). Lys2972 crystals were obtained using the reservoir solution containing 0.2 M LiCl, 0.1 M NaAc pH 5.0, and 20% PEG-6000. The crystals were cryoprotected by the reservoir solution supplemented with 15% ethylene glycol before data collection at the LRL-CAT (31-ID) beamline at Advanced Photon Source, Argonne National Laboratory. Data processing and scaling were performed with XDS (Kabsch 2010). Lys2972 crystals were in the C2 space group with unit cell parameters of a = 83.2, b = 64.8, c = 55.4 Å, and β = 122.7°. The structure was determined by Phaser-MR (McCoy et al. 2007) using the templates predicted by Robetta (Baek et al. 2021) followed by multiple cycles of ARP/wARP model building (Langer et al. 2008). To obtain the final structure, multiple cycles of refinement using Refmac5 (Murshudov et al. 1997) in CCP4 (Winn et al. 2011) followed by manual model rebuilding with Coot (Emsley and Cowtan 2004) were carried out. The stereochemistry of the final model was analyzed with PROCHECK (Laskowski et al. 1993). Data collection and refinement statistics are shown in supplementary table S3, Supplementary Material online.
Evaluation of the Activity of Lys2972 and Derivatives on S. thermophilus Cells
The lytic activity of Lys2972 and its mutant derivatives was evaluated by monitoring the reduction in turbidity of a solution containing S. thermophilus DGCC7710 cells at an OD600nm of 0.4. Bacterial cells were resuspended in lysis buffer (20 mM Tris-HCl, 200 mM NaCl, pH 7.0) and mixed in a 96-well microplate with purified endolysin diluted to the desired concentration in endolysin buffer (20 mM Tris pH 7.3, 150 mM NaCl). CaCl2 was added to both the endolysin and lysis buffers as needed. The decrease in turbidity was measured at 37 °C using a microplate reader (BioTek Synergy HTX Multimode Reader).
Evaluation of the Heat Stability and Lytic Activity of Lys2972 and Its CaBM Mutants at Different Temperatures
To conduct thermal shift assays, solutions of Lys2972 and CaBM mutants (each at a concentration of 30 µM) were prepared with or without 10 mM calcium. Next, 20 μl of the prepared protein was mixed with 5 μl of 500-fold diluted JBS Thermofluor Dye (Jena Bioscience), prepared in the same buffer. After an initial incubation at 4 °C for 2 min, samples were gradually heated at a rate of 0.5 °C/min, ending at 95 °C, using a CFX96 Touch RealTime PCR Detection System (Bio-Rad). The FRET channel was used to collect data during the heating process, and the results were analyzed using the Bio-Rad CFX Maestro software.
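For readers unfamiliar with how a melting temperature is read from such a melt curve, a common generic approach is to take the temperature at which the dye fluorescence rises fastest, that is, the maximum of the first derivative dF/dT. The sketch below illustrates that idea only; it is not the algorithm of the Bio-Rad CFX Maestro software used here, the function name is hypothetical, and the curve is synthetic rather than measured data.

```python
import math


def estimate_tm(temps, fluorescence):
    """Return the temperature at which fluorescence increases fastest."""
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equal-length series of at least 3 points")
    best_slope, best_temp = float("-inf"), None
    # Central finite differences over the interior points of the curve.
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_slope, best_temp = slope, temps[i]
    return best_temp


# Synthetic sigmoidal unfolding curve with its midpoint at 55 °C
# (illustrative values only, sampled every 0.5 °C from 25 to 95 °C).
temps = [25.0 + 0.5 * i for i in range(141)]
signal = [1.0 / (1.0 + math.exp(-(t - 55.0) / 2.0)) for t in temps]
tm = estimate_tm(temps, signal)
```

Comparing the value obtained with and without 10 mM calcium would give the calcium-dependent shift in melting temperature; real curves usually need smoothing before differentiation.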
The lytic activity of Lys2972 and Lys2972CaBM was also tested in the presence or absence of calcium and after an incubation of 15 min at different temperatures. Solutions of the two endolysins were prepared at a final concentration of 30 µM with endolysin buffer with or without 10 mM calcium. Each solution was then distributed into separate 500 µl PCR tubes in 100 µl aliquots and exposed to different temperatures using a thermal cycler. After thermal exposure, the tubes were centrifuged at 13,000 rpm for 2.5 min and the resulting supernatant was diluted tenfold before being tested for lytic activity using the same methodology as previously described.
To evaluate the lytic activity of Lys2972 and Lys2972CaBM at different temperatures, 100 μl of S. thermophilus DGCC7710 cells in the exponential phase (resuspended in lysis buffer) and 200 μl of endolysin (4 µM) were pre-incubated in 500 μl PCR tubes. The pre-incubation was done in a water bath at temperatures of 37 °C, 45 °C, 55 °C, or 60 °C, with 10 mM CaCl2 as required. After 15 min of pre-incubation, 100 μl of endolysin was added to the tubes containing the bacterial cells, and the decrease in turbidity was measured 15 min later using a DeNovix DS-11 Spectrophotometer.
Genome Editing of Phage 2972
CRISPR-Cas-based genome editing was used to inactivate the Lys2972 CaBM in phage 2972. To achieve this, a natural CRISPR BIM (Bacteriophage Insensitive Mutant) targeting the C-terminal part of Lys2972 was isolated as previously described (Hynes et al. 2017) and designated as S. thermophilus DGCC 7710 BIM-Lys2972 (supplementary table S2, Supplementary Material online). Subsequently, repair templates were created by introducing mutations into the Lys2972 CaBM. Specifically, the codons encoding residues N208, D216, and D219 were replaced with codons encoding alanine (phage mutant 2972CaBM) or, alternatively, only D219 was replaced with alanine (phage mutant 2972D219A). The mutated CaBM was flanked on each side by approximately 250 bp of homologous phage genomic sequence, and an additional synonymous mutation was inserted in the protospacer adjacent motif (PAM) to prevent CRISPR-Cas interference with the repair template without changing the encoded amino acid sequence. The gene fragment (gBlock) was synthesized with complementary ends (Integrated DNA Technologies, Inc.) for Gibson assembly at the XbaI restriction site of plasmid pNZ123 (De Vos 1987). Recombinant phages were isolated from plaques after the infection of S. thermophilus DGCC 7710 BIM-Lys2972 strain transformed with the repair template plasmid. Modified phages with the correct mutations were screened with the primer pair 2972_CaCl2_Fw and 2972_CaCl2_Rv. Subsequently, the genome of one of the phage mutants, designated as 2972CaBM, was fully sequenced.
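The D219A substitution can be obtained with a single nucleotide change: in the standard genetic code, replacing the adenine at the second position of either aspartate codon (GAT or GAC) with a cytosine yields an alanine codon (GCT or GCC), and the reverse C-to-A change restores aspartate. The following minimal sketch only illustrates this codon logic; the helper name is hypothetical, and which of the two aspartate codons is present in lys2972 is not assumed.

```python
# Subset of the standard genetic code relevant to the Asp <-> Ala change.
CODON_TABLE = {"GAT": "D", "GAC": "D", "GCT": "A", "GCC": "A"}


def mutate_second_position(codon: str, new_base: str) -> str:
    """Return the codon with its second nucleotide replaced by new_base."""
    codon, new_base = codon.upper(), new_base.upper()
    if len(codon) != 3 or new_base not in {"A", "C", "G", "T"}:
        raise ValueError("expected a 3-nt codon and a single DNA base")
    return codon[0] + new_base + codon[2]


for asp_codon in ("GAT", "GAC"):
    ala_codon = mutate_second_position(asp_codon, "C")   # D -> A
    reverted = mutate_second_position(ala_codon, "A")    # A -> D (reversion)
    print(asp_codon, CODON_TABLE[asp_codon], "->",
          ala_codon, CODON_TABLE[ala_codon], "->",
          reverted, CODON_TABLE[reverted])
```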
Phage DNA Sequencing and Analysis
Phage genomic DNA was extracted following the procedures outlined elsewhere (Moineau et al. 1994). Libraries were prepared using the Nextera XT DNA library preparation kit (Illumina), adhering to the manufacturer's instructions. Sequencing was carried out on a MiSeq system, utilizing a MiSeq reagent kit v2 (Illumina). Subsequently, the reads underwent cleaning through Trimmomatic v0.36 (Bolger et al. 2014) and were assembled to obtain circular complete sequences, utilizing Ray v3.0.1 (Boisvert et al. 2010) and SPAdes v3.13 (Bankevich et al. 2012).
Quantification of Phage Lys2972 and CaBM Mutant Plaque Size and Fitness at Different Temperatures
To analyze the size of lytic plaques generated by phages 2972 and 2972CaBM, phage lysates were diluted to produce 50 to 100 lytic plaques per plate. The plates were then incubated at 37 °C, 42 °C, or 46 °C. The surface area of the lytic plaques was determined by image analysis as previously described (Oechslin et al. 2022).
To evaluate the fitness costs associated with CaBM inactivation, phages 2972 and 2972CaBM were first co-amplified at different temperatures. S. thermophilus DGCC7710 cells at an OD600nm of 0.2 were infected with both phages 2972 and 2972CaBM at an initial MOI of 0.0001 and incubated overnight at 37 °C, 42 °C, or 46 °C in different water baths. The fitness cost associated with CaBM inactivation was determined by calculating the difference between the titer of phage 2972CaBM and the titer of the total phage population. To determine the number of 2972CaBM particles in the mixed population, a plaque assay was performed using the strain S. thermophilus DGCC7710 BIM-Lys2972. This strain contains a spacer that specifically targets lys2972, resulting in a significant 5.0 ± 0.1 log pfu/ml reduction of the phage 2972 population. However, this reduction does not apply to 2972CaBM, as its PAM was modified to prevent any interference with the CRISPR-Cas system during genome editing. A parallel titration was performed with strain S. thermophilus DGCC7710 to determine the titer of the entire phage population.
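The titer comparison described above can be sketched as follows, assuming that the difference is expressed on a log10 scale (an assumption; the scale is not stated here). The function name and the example titers are hypothetical and are not measurements from this study.

```python
import math


def log_titer_difference(mutant_pfu_per_ml: float,
                         total_pfu_per_ml: float) -> float:
    """log10(mutant titer) - log10(total titer).

    Negative values mean the mutant is a minority of the mixed population.
    """
    if mutant_pfu_per_ml <= 0 or total_pfu_per_ml <= 0:
        raise ValueError("titers must be positive")
    return math.log10(mutant_pfu_per_ml) - math.log10(total_pfu_per_ml)


# Hypothetical titers: mutant counted on the BIM-Lys2972 strain,
# total population counted on the wild-type DGCC7710 strain.
difference = log_titer_difference(1e7, 1e9)
```

In this illustrative case the mutant titer is two log units below the total titer, meaning that the mutant represents about 1% of the population after co-amplification.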
Quantification of Phage Lys2972 and CaBM Mutant Lytic Activity After Heat Shock or Under Conditions That Mimic Cheese Production
To investigate the effects of CaBM inactivation on heat shock resistance, S. thermophilus DGCC7710 cells were cultured in glass tubes until they reached the early exponential phase. The cells were then infected with either phage 2972 or 2972CaBM at a MOI of 0.1. Following infection, the cultures were subjected to heat treatment for 15 min by immersing the tubes in a 60 °C water bath, either immediately or at 10-min intervals. After the heat shock treatment, the cultures were maintained in a water bath at 42 °C, and the cell turbidity was monitored every 30 min for 5.5 h using a SPECTRONIC 20+ spectrophotometer.
To investigate the impact of CaBM inactivation in thermal conditions that replicate the cheese-making process, S. thermophilus DGCC7710 cells were cultured in glass tubes until they reached the early exponential phase. The cells were then infected with either phage 2972 or 2972CaBM at a MOI of 0.0001. Subsequently, the tubes were transferred to a programmable water bath that replicated the temperature gradients during cheese production. This involved an initial incubation period at 35 °C for 60 to 80 min, followed by a temperature shift to 55 °C over 20 min, which was then maintained for an additional 30 min. Finally, the temperature was decreased to 40 °C over 1 h. After these incubations, the infected cells were transferred to a 96-well plate and incubated at 40 °C in a microplate reader and absorbance was measured every 15 min for 16 h (BioTek Synergy HTX Multimode Reader). Time at maximum rate (t at max V) of cell lysis kinetics was determined using BioTek Gen5 software (v3.12).
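The temperature program described above can be written as a piecewise function of time. The sketch below assumes linear ramps (the shape of the ramps is not specified here) and uses a 60-min initial hold by default, within the stated 60 to 80 min range; the function name is hypothetical.

```python
def cheese_profile_temperature(t_min: float,
                               initial_hold_min: float = 60.0) -> float:
    """Temperature (°C) at t_min minutes after the start of the program."""
    if t_min < 0 or initial_hold_min <= 0:
        raise ValueError("time must be >= 0 and the initial hold > 0")
    # Each segment: (duration in min, start temperature, end temperature).
    segments = [
        (initial_hold_min, 35.0, 35.0),  # initial incubation at 35 °C
        (20.0, 35.0, 55.0),              # shift to 55 °C over 20 min
        (30.0, 55.0, 55.0),              # 55 °C maintained for 30 min
        (60.0, 55.0, 40.0),              # decrease to 40 °C over 1 h
    ]
    elapsed = 0.0
    for duration, start, end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)  # linear interpolation
        elapsed += duration
    return 40.0  # program finished; cultures were then kept at 40 °C
```

For example, with the default hold, `cheese_profile_temperature(70)` falls halfway through the heating ramp, and any time beyond 170 min returns the final 40 °C.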
The fitness costs associated with CaBM were assessed by co-amplifying phages 2972 and 2972CaBM at a MOI of 0.0001, under the same thermal conditions that were used to replicate the cheese-making process. The fitness cost was determined by calculating the difference between the titer of phage 2972CaBM and the titer of the total phage population using strain S. thermophilus DGCC7710 BIM-Lys2972, as described previously.
Experimental Evolution of Lys2972 CaBM
The phage CaBM mutant 2972D219A was amplified in triplicate for a total of 30 transfers. S. thermophilus DGCC7710 cells in the exponential growth phase (OD600nm of 0.2) were infected at an initial MOI of 0.01. Infected cells were then incubated at 37 °C or 46 °C, or were incubated for an additional 40 min at 42 °C before being subjected to a 15-min heat shock at 60 °C. In addition, infected cells were exposed, 40 min post-infection, to thermal conditions mimicking those encountered during the production of pressed cooked cheese. After an incubation period of 12 h at 30 °C, phage lysates were filtered and diluted 100-fold before starting another round of amplification.
After 30 transfers, phage DNA was extracted, and the CaBM was sequenced using Illumina MiSeq V2. Libraries for high-throughput sequencing of PCR products were prepared following the “16S Metagenomic Sequencing Library Preparation” protocol by Illumina, using primers with extensions containing the adapters (CaBM_D219A_Fw and CaBM_D219A_Rv, supplementary table S2, Supplementary Material online) and the MiSeq Reagent Kit v2. Adapters and low-quality bases were trimmed from the raw reads using trimGalore with the default parameters. Subsequently, the filtered reads were mapped to the 200 bp reference using bwa mem (Li and Durbin 2009). Over 99.2% of reads mapped in all samples, and the reference was sequenced with an average depth of 106 coverage (supplementary fig. S10, Supplementary Material online). Single nucleotide polymorphisms (SNPs) were called using freebayes (-F 0.01 -p 12 --pooled-discrete) (Garrison and Marth 2012). To establish the detection limit, we considered 0.002, which represented the highest SNP frequency observed in the control. Coincidentally, the default Phred quality score of 40 corresponds to the 0.002 technical error rate, serving as the technical error threshold.
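The comparison of a population-level SNP frequency with the detection limit can be sketched as follows. The 0.002 limit is taken from the text; the function names, the strict "greater than" comparison, and the read counts are illustrative assumptions and do not correspond to data from this study or to the freebayes output format.

```python
DETECTION_LIMIT = 0.002  # highest SNP frequency observed in the control


def reversion_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads carrying the C-to-A reversion at the D219A site."""
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("need 0 <= alt_reads <= total_reads and total > 0")
    return alt_reads / total_reads


def is_above_detection_limit(frequency: float,
                             limit: float = DETECTION_LIMIT) -> bool:
    """Assumption: a reversion is called only when strictly above the limit."""
    return frequency > limit


# Hypothetical read counts for one evolved population.
frequency = reversion_frequency(alt_reads=50, total_reads=10_000)
detected = is_above_detection_limit(frequency)
```

With these illustrative counts, the reversion frequency is 0.005 and would be reported as above the detection limit, whereas a frequency of 0.001 would not.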
Supplementary Material
Supplementary material is available at Molecular Biology and Evolution online.
Acknowledgments
The authors would like to thank Denise Tremblay, Genevieve Rousseau, Dr. Alessandra Gonçalves de Melo, Rachel Morin-Pelchat, Zacharie Morneau, and Dr. Harald Brüssow for discussion and technical help. We thank Amanda Toperoff and Michi Waygood for editorial assistance. F.O. was supported by a post-doctoral fellowship from the Swiss National Science Foundation under grant P400PB_191059. C.M. and X.Z. were supported by graduate scholarships from the Fonds de Recherche du Québec—Nature et Technologies. V.S. was supported by a post-doctoral fellowship from the Swiss National Science Foundation under grant P500PB_214419. R.S. and S.M. are thankful to PROTEO for funding through their new initiative program. S.M. also acknowledges funding by the Natural Sciences and Engineering Research Council of Canada (Discovery and Alliance Mission programs). S.M. holds a T1 Canada Research Chair in Bacteriophages. This research used resources of the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under contract no. DE-AC02-06CH11357. Use of the Lilly Research Laboratories Collaborative Access Team (LRL-CAT) beamline at Sector 31 of the Advanced Photon Source was provided by Eli Lilly Company, which operates the facility.
Contributor Information
Frank Oechslin, Département de biochimie, de microbiologie, et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec City, Canada; Institut de Biologie Intégrative et des Systèmes (IBIS), Pavillon Charles-Eugène-Marchand, Université Laval, Quebec City, Canada; Groupe de recherche en écologie buccale, Faculté de médecine dentaire, Université Laval, Québec City, Canada.
Xiaojun Zhu, Département de biochimie, de microbiologie, et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec City, Canada; Institut de Biologie Intégrative et des Systèmes (IBIS), Pavillon Charles-Eugène-Marchand, Université Laval, Quebec City, Canada.
Carlee Morency, Département de biochimie, de microbiologie, et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec City, Canada; Institut de Biologie Intégrative et des Systèmes (IBIS), Pavillon Charles-Eugène-Marchand, Université Laval, Quebec City, Canada; Groupe de recherche en écologie buccale, Faculté de médecine dentaire, Université Laval, Québec City, Canada.
Vincent Somerville, Département de biochimie, de microbiologie, et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec City, Canada; Institut de Biologie Intégrative et des Systèmes (IBIS), Pavillon Charles-Eugène-Marchand, Université Laval, Quebec City, Canada; Groupe de recherche en écologie buccale, Faculté de médecine dentaire, Université Laval, Québec City, Canada; Department of Microbiology & Immunology, Faculty of Medicine and Health Sciences, McGill University, Montreal, Quebec, Canada.
Rong Shi, Département de biochimie, de microbiologie, et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec City, Canada; Institut de Biologie Intégrative et des Systèmes (IBIS), Pavillon Charles-Eugène-Marchand, Université Laval, Quebec City, Canada.
Sylvain Moineau, Département de biochimie, de microbiologie, et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec City, Canada; Institut de Biologie Intégrative et des Systèmes (IBIS), Pavillon Charles-Eugène-Marchand, Université Laval, Quebec City, Canada; Groupe de recherche en écologie buccale, Faculté de médecine dentaire, Université Laval, Québec City, Canada; Félix d’Hérelle Reference Center for Bacterial Viruses, Université Laval, Québec City, Canada.
Data Availability
The data underlying this article are available in the article and in its online Supplementary material.
References
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997:25(17):3389–3402. 10.1093/nar/25.17.3389. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Anantharaman V, Aravind L. Evolutionary history, structural features and biochemical diversity of the NlpC/P60 superfamily of enzymes. Genome Biol. 2003:4(2):R11. 10.1186/gb-2003-4-2-r11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, Wang J, Cong Q, Kinch LN, Schaeffer RD, et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science. 2021:373(6557):871–876. 10.1126/science.abj8754. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bajorath J, Raghunathan S, Hinrichs W, Saenger W. Long-range structural changes in proteinase K triggered by calcium ion removal. Nature. 1989:337(6206):481–484. 10.1038/337481a0. [DOI] [PubMed] [Google Scholar]
- Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012:19(5):455–477. 10.1089/cmb.2012.0021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Boisvert S, Laviolette F, Corbeil J. Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J Comput Biol. 2010:17(11):1519–1533. 10.1089/cmb.2009.0238. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014:30(15):2114–2120. 10.1093/bioinformatics/btu170. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bolotin A, Quinquis B, Renault P, Sorokin A, Ehrlich SD, Kulakauskas S, Lapidus A, Goltsman E, Mazur M, Pusch GD, et al. Complete sequence and comparative genome analysis of the dairy bacterium Streptococcus thermophilus. Nat Biotechnol. 2004:22(12):1554–1558. 10.1038/nbt1034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen Y, Simmonds RS, Young JK, Timkovich R. Solution structure of the recombinant target recognition domain of zoocin A. Proteins. 2013:81(4):722–727. 10.1002/prot.24224. [DOI] [PubMed] [Google Scholar]
- Clokie MR, Millard AD, Letarov AV, Heaphy S. Phages in nature. Bacteriophage. 2011:1(1):31–45. 10.4161/bact.1.1.14942. [DOI] [PMC free article] [PubMed] [Google Scholar]
- De Baets G, Van Durme J, Reumers J, Maurer-Stroh S, Vanhee P, Dopazo J, Schymkowitz J, Rousseau F. SNPeffect 4.0: on-line prediction of molecular and structural effects of protein-coding variants. Nucleic Acids Res. 2012:40(D1):935–939. 10.1093/nar/gkr996. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Delorme C. Safety assessment of dairy microorganisms: Streptococcus thermophilus. Int J Food Microbiol. 2008:126(3):274–277. 10.1016/j.ijfoodmicro.2007.08.014. [DOI] [PubMed] [Google Scholar]
- Delorme C, Legravet N, Jamet E, Hoarau C, Alexandre B, El-Sharoud WM, Darwish MS, Renault P. Study of Streptococcus thermophilus population on a world-wide and historical collection by a new MLST scheme. Int J Food Microbiol. 2017:242:70–81. 10.1016/j.ijfoodmicro.2016.11.016.
- De Vos WM. Gene cloning and expression in lactic streptococci. FEMS Microbiol Lett. 1987:46(3):281–295. 10.1016/0378-1097(87)90113-3.
- Di Cagno R, Gobbetti M. Lactic acid bacteria | Lactobacillus spp.: Lactobacillus helveticus. In: Fuquay JW, editor. Encyclopedia of dairy sciences (second edition). San Diego: Academic Press; 2011. p. 105–110.
- Dion MB, Oechslin F, Moineau S. Phage diversity, genomics and phylogeny. Nat Rev Microbiol. 2020:18(3):125–138. 10.1038/s41579-019-0311-5.
- Duplessis M, Russell WM, Romero DA, Moineau S. Global gene expression analysis of two Streptococcus thermophilus bacteriophages using DNA microarray. Virology. 2005:340(2):192–208. 10.1016/j.virol.2005.05.033.
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004:32(5):1792–1797. 10.1093/nar/gkh340.
- Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004:60(12):2126–2132. 10.1107/S0907444904019158.
- Fernandes S, Sao-Jose C. Enzymes and mechanisms employed by tailed bacteriophages to breach the bacterial cell barriers. Viruses. 2018:10(8):396. 10.3390/v10080396.
- Fernández-Ruiz I, Coutinho FH, Rodriguez-Valera F. Thousands of novel endolysins discovered in uncultured phage genomes. Front Microbiol. 2018:9:1033. 10.3389/fmicb.2018.01033.
- Fischetti VA. Bacteriophage lysins as effective antibacterials. Curr Opin Microbiol. 2008:11(5):393–400. 10.1016/j.mib.2008.09.012.
- Foley S, Bruttin A, Brüssow H. Widespread distribution of a group I intron and its three deletion derivatives in the lysin gene of Streptococcus thermophilus bacteriophages. J Virol. 2000:74(2):611–618. 10.1128/JVI.74.2.611-618.2000.
- Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012:28(23):3150–3152. 10.1093/bioinformatics/bts565.
- Garneau JE, Moineau S. Bacteriophages of lactic acid bacteria and their impact on milk fermentations. Microb Cell Fact. 2011:10(Suppl 1):S20. 10.1186/1475-2859-10-S1-S20.
- Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv. 2012. 10.48550/arXiv.1207.3907.
- Gibson DG, Young L, Chuang R-Y, Venter JC, Hutchison CA, Smith HO. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat Methods. 2009:6(5):343–345. 10.1038/nmeth.1318.
- Gobbetti M, Calasso M. Streptococcus | Introduction. In: Batt CA, Tortorello ML, editors. Encyclopedia of food microbiology (second edition). Oxford: Academic Press; 2014. p. 535–553.
- Hambly E, Suttle CA. The viriosphere, diversity, and genetic exchange within phage communities. Curr Opin Microbiol. 2005:8(4):444–450. 10.1016/j.mib.2005.06.005.
- Henzler-Wildman K, Kern D. Dynamic personalities of proteins. Nature. 2007:450(7172):964–972. 10.1038/nature06522.
- Hermoso JA, Monterroso B, Albert A, Galan B, Ahrazem O, Garcia P, Martinez-Ripoll M, Garcia JL, Menendez M. Structural basis for selective recognition of pneumococcal cell wall by modular endolysin from phage Cp-1. Structure. 2003:11(10):1239–1249. 10.1016/j.str.2003.09.005.
- Hols P, Hancy F, Fontaine L, Grossiord B, Prozzi D, Leblond-Bourget N, Decaris B, Bolotin A, Delorme C, Dusko Ehrlich S, et al. New insights in the molecular biology and physiology of Streptococcus thermophilus revealed by comparative genomics. FEMS Microbiol Rev. 2005:29(3):435–463. 10.1016/j.femsre.2005.04.008.
- Hynes AP, Lemay ML, Trudel L, Deveau H, Frenette M, Tremblay DM, Moineau S. Detecting natural adaptation of the Streptococcus thermophilus CRISPR-Cas systems in research and classroom settings. Nat Protoc. 2017:12(3):547–565. 10.1038/nprot.2016.186.
- Kabsch W. XDS. Acta Crystallogr D Biol Crystallogr. 2010:66(2):125–132. 10.1107/S0907444909047337.
- Kim B, Wang Y-C, Hespen CW, Espinosa J, Salje J, Rangan KJ, Oren DA, Kang JY, Pedicord VA, Hang HC. Enterococcus faecium secreted antigen A generates muropeptides to enhance host immunity and limit bacterial pathogenesis. eLife. 2019:8:e45343. 10.7554/eLife.45343.
- Kirberger M, Yang JJ. Calcium-binding protein site types. In: Kretsinger RH, Uversky VN, Permyakov EA, editors. Encyclopedia of metalloproteins. New York, NY: Springer New York; 2013. p. 511–521.
- Kropinski AM, Mazzocco A, Waddell TE, Lingohr E, Johnson RP. Enumeration of bacteriophages by double agar overlay plaque assay. Methods Mol Biol. 2009:501:69–76. 10.1007/978-1-60327-164-6_7.
- Lange OF, Lakomek NA, Farès C, Schröder GF, Walter KF, Becker S, Meiler J, Grubmüller H, Griesinger C, de Groot BL. Recognition dynamics up to microseconds revealed from an RDC-derived ubiquitin ensemble in solution. Science. 2008:320(5882):1471–1475. 10.1126/science.1157092.
- Langer G, Cohen SX, Lamzin VS, Perrakis A. Automated macromolecular model building for X-ray crystallography using ARP/wARP version 7. Nat Protoc. 2008:3(7):1171–1179. 10.1038/nprot.2008.91.
- Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr. 1993:26(2):283–291. 10.1107/S0021889892009944.
- Lee WJ, Lucey JA. Structure and physical properties of yogurt gels: effect of inoculation rate and incubation temperature. J Dairy Sci. 2004:87(10):3153–3164. 10.3168/jds.S0022-0302(04)73450-5.
- Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021:49(W1):W293–W296. 10.1093/nar/gkab301.
- Lewis MJ. The measurement and significance of ionic calcium in milk—a review. Int J Dairy Technol. 2011:64(1):1–13. 10.1111/j.1471-0307.2010.00639.x.
- Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009:25(14):1754–1760. 10.1093/bioinformatics/btp324.
- Li TT, Tian WL, Gu CT. Elevation of Lactococcus lactis subsp. cremoris to the species level as Lactococcus cremoris sp. nov. and transfer of Lactococcus lactis subsp. tructae to Lactococcus cremoris as Lactococcus cremoris subsp. tructae comb. nov. Int J Syst Evol Microbiol. 2021:71(3). 10.1099/ijsem.0.004727.
- Lin DM, Koskella B, Lin HC. Phage therapy: an alternative to antibiotics in the age of multi-drug resistance. World J Gastrointest Pharmacol Ther. 2017:8(3):162–173. 10.4292/wjgpt.v8.i3.162.
- Liu S-Q, Tao Y, Meng Z-H, Fu Y-X, Zhang K-Q. The effect of calciums on molecular motions of proteinase K. J Mol Model. 2011:17(2):289–300. 10.1007/s00894-010-0724-6.
- Loessner MJ, Kramer K, Ebel F, Scherer S. C-terminal domains of Listeria monocytogenes bacteriophage murein hydrolases determine specific recognition and high-affinity binding to bacterial cell wall carbohydrates. Mol Microbiol. 2002:44(2):335–349. 10.1046/j.1365-2958.2002.02889.x.
- Marcó MB, Moineau S, Quiberoni A. Bacteriophages and dairy fermentations. Bacteriophage. 2012:2(3):149–158. 10.4161/bact.21868.
- McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ. Phaser crystallographic software. J Appl Crystallogr. 2007:40(4):658–674. 10.1107/S0021889807021206.
- Milles LF, Unterauer EM, Nicolaus T, Gaub HE. Calcium stabilizes the strongest protein fold. Nat Commun. 2018:9(1):4764. 10.1038/s41467-018-07145-6.
- Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nat Methods. 2022:19(6):679–682. 10.1038/s41592-022-01488-1.
- Moineau S, Pandian S, Klaenhammer TR. Evolution of a lytic bacteriophage via DNA acquisition from the Lactococcus lactis chromosome. Appl Environ Microbiol. 1994:60(6):1832–1841. 10.1128/aem.60.6.1832-1841.1994.
- Müller A, Hinrichs W, Wolf WM, Saenger W. Crystal structure of calcium-free proteinase K at 1.5-Å resolution. J Biol Chem. 1994:269(37):23108–23111. 10.1016/S0021-9258(17)31626-5.
- Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr. 1997:53(3):240–255. 10.1107/S0907444996012255.
- Oechslin F, Zhu X, Dion MB, Shi R, Moineau S. Phage endolysins are adapted to specific hosts and are evolutionarily dynamic. PLoS Biol. 2022:20(8):e3001740. 10.1371/journal.pbio.3001740.
- Oliveira H, Melo LD, Santos SB, Nóbrega FL, Ferreira EC, Cerca N, Azeredo J, Kluskens LD. Molecular aspects and comparative genomics of bacteriophage endolysins. J Virol. 2013:87(8):4558–4570. 10.1128/JVI.03277-12.
- Passerini D, Beltramo C, Coddeville M, Quentin Y, Ritzenthaler P, Daveran-Mingot M-L, Le Bourgeois P. Genes but not genomes reveal bacterial domestication of Lactococcus lactis. PLoS One. 2010:5(12):e15306. 10.1371/journal.pone.0015306.
- Philippe C, Levesque S, Dion MB, Tremblay DM, Horvath P, Lüth N, Cambillau C, Franz C, Neve H, Fremaux C, et al. Novel genus of phages infecting Streptococcus thermophilus: genomic and morphological characterization. Appl Environ Microbiol. 2020:86(13):e00227-20. 10.1128/AEM.00227-20.
- Rigden DJ, Jedrzejas MJ, Galperin MY. Amidase domains from bacterial and phage autolysins define a family of gamma-D, L-glutamate-specific amidohydrolases. Trends Biochem Sci. 2003:28(5):230–234. 10.1016/S0968-0004(03)00062-8.
- Schmelcher M, Donovan DM, Loessner MJ. Bacteriophage endolysins as novel antimicrobials. Future Microbiol. 2012:7(10):1147–1171. 10.2217/fmb.12.97.
- Söding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005:33(Web Server):W244–W248. 10.1093/nar/gki408.
- Vázquez R, García E, García P. Sequence-function relationships in phage-encoded bacterial cell wall lytic enzymes and their implications for phage-derived products design. J Virol. 2021:95(14):e0032121. 10.1128/JVI.00321-21.
- Vollmer W, Blanot D, De Pedro MA. Peptidoglycan structure and architecture. FEMS Microbiol Rev. 2008:32(2):149–167. 10.1111/j.1574-6976.2007.00094.x.
- Vuillemard J-C. Science et technologie du lait. 3e édition. Quebec City (QC): Les Presses de l’Université Laval; 2018.
- Wang IN, Smith DL, Young R. Holins: the protein clocks of bacteriophage infections. Annu Rev Microbiol. 2000:54(1):799–825. 10.1146/annurev.micro.54.1.799.
- Winn MD, Ballard CC, Cowtan KD, Dodson EJ, Emsley P, Evans PR, Keegan RM, Krissinel EB, Leslie AG, McCoy A, et al. Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr. 2011:67(4):235–242. 10.1107/S0907444910045749.
- Yu C, Gunasekaran S. A systems analysis of pasta filata process during mozzarella cheese making. J Food Eng. 2005:69(4):399–408. 10.1016/j.jfoodeng.2004.08.031.
- Zheng H, Cooper DR, Porebski PJ, Shabalin IG, Handing KB, Minor W. CheckMyMetal: a macromolecular metal-binding validation tool. Acta Crystallogr D Struct Biol. 2017:73(3):223–233. 10.1107/S2059798317001061.
- Zhou B, Zhen X, Zhou H, Zhao F, Fan C, Perčulija V, Tong Y, Mi Z, Ouyang S. Structural and functional insights into a novel two-component endolysin encoded by a single gene in Enterococcus faecalis phage. PLoS Pathog. 2020:16(3):e1008394. 10.1371/journal.ppat.1008394.
Data Availability Statement
The data underlying this article are available in the article and in its online Supplementary material.




