Abstract
Genome modifications are central components of the continuous arms race between viruses and their hosts. The archaeosine base (G+), which was thought to be found only in archaeal tRNAs, was recently detected in genomic DNA of Enterobacteria phage 9g and was proposed to protect phage DNA from a wide variety of restriction enzymes. In this study, we identify three additional 2′-deoxy-7-deazaguanine modifications, which are all intermediates of the same pathway, in viruses: 2′-deoxy-7-amido-7-deazaguanine (dADG), 2′-deoxy-7-cyano-7-deazaguanine (dPreQ0) and 2′-deoxy-7-aminomethyl-7-deazaguanine (dPreQ1). We identify 180 phages or archaeal viruses that encode at least one of the enzymes of this pathway with an overrepresentation (60%) of viruses potentially infecting pathogenic microbial hosts. Genetic studies with the Escherichia phage CAjan show that DpdA is essential to insert the 7-deazaguanine base in phage genomic DNA and that 2′-deoxy-7-deazaguanine modifications protect phage DNA from host restriction enzymes.
Subject terms: Biochemistry, Chemical biology, Computational biology and bioinformatics, Ecology, Evolution
Viral genomic DNA is often modified to evade the host bacterial restriction system. Here the authors identified 2′-deoxy-7-deazaguanine modifications on phage DNA by comparative genomics and experimental validation, showing their role in genome protection.
Introduction
In the continuous battle between bacteria and phages, both entities are constantly evolving defenses and counterattack mechanisms1–5. To escape these defenses, phages have developed multiple strategies6–8, and one of the most widespread strategies is to modify their DNA. For example, the genomic DNA of Escherichia coli phage T4 contains the nucleobase glucosyl-hydroxymethylcytosine, which inhibits the restriction–modification (RM) and clustered regularly interspaced short palindromic repeat (CRISPR)–CRISPR-associated (Cas) systems9. The increased availability of complete phage genome sequences has led to recent discoveries of novel complex DNA modifications, such as 2′-deoxy-5-hydroxymethyluracil derivatives in Pseudomonas phage M6, Salmonella phage Vil, and Delftia phage phi W-14 (ref. 10) and 2′-deoxyarchaeosine (dG+) in Enterobacteria phage 9g11.
Two 7-deazaguanine modifications, 2′-deoxy-7-amido-7-deazaguanosine (dADG) and the 2′-deoxyribonucleoside analog of archaeosine, which were previously thought to be present only in tRNA as queuosine (Q) in bacteria and archaeosine (G+) in archaea, were recently discovered in bacteria and phage DNA, respectively, by combining in silico data mining and experimental validation11. As shown in Fig. 1, 7-cyano-7-deazaguanine (preQ0) is synthesized from GTP by four enzymes (FolE, QueD, QueE, QueC) and is the key intermediate in both the Q and G+ pathways12–14. tRNA-guanine-transglycosylases (TGT in bacteria, arcTGT in archaea) are the signature enzymes in the Q and G+ tRNA modification pathways, as they exchange the targeted guanines with 7-deazaguanine precursors. In archaea, preQ0 is directly incorporated into tRNA by arcTGT before being further modified by different types of amidotransferases (ArcS, Gat-QueC, or QueF-L)15–17. In bacteria, preQ0 is reduced to 7-aminomethyl-7-deazaguanine (preQ1) by QueF18 before TGT incorporates it in tRNA19, where it is further modified to Q in two steps20–22 (Fig. 1).
The presence of homologs of Q synthesis genes has long been reported in phage genomes23–26. However, the role of these genes in DNA modification rather than in RNA modification was only recently postulated. Indeed, TGT paralogs (now called DpdA) were found to be involved in modifying DNA in specific bacteria and phage genomes. In bacteria, the dpdA gene is often located in a cluster of over ten genes that encode a RM system that inserts ADG into DNA and prevents replication of unmodified DNA11,27. In Enterobacteria phage 9g28, dpdA is associated with G+ synthesis, and up to 27% of the dG in this phage is replaced by dG+11,29. This modification is proposed to play an anti-restriction role28 because 7-deazaguanine derivatives can block the activity of a wide variety of restriction enzymes without inhibiting the activity of the polymerases needed for phage DNA replication30.
Building on the discovery of dG+ in Enterobacteria phage 9g, we systematically explore the genomes of other phages for potential pathways involved in 7-deazaguanine insertion in DNA and experimentally validate a subset. This work reveals a much greater diversity in the 7-deazaguanine modifications and their corresponding pathways than anticipated. Moreover, we show that 7-deazaguanine derivatives have been hijacked by phages to evade RM systems.
Results
Phage 9g encodes functional preQ0 synthesis genes
The expression of folE, queD, and queE from Enterobacteria phage 9g in trans in E. coli MG1655 ΔfolE, ΔqueD, and ΔqueE strains, respectively, successfully re-established the production of Q, demonstrating the isofunctionality of the tested pairs (Fig. 2a). This complementation was not observed when the viral gat-queC and dpdA genes were expressed in E. coli ΔqueC and Δtgt, respectively. The result was expected for dpdA, as dpdA was predicted to encode an enzyme that recognizes DNA and not tRNA11,31. This result was unexpected for gat-queC, as we had previously shown that expression of an archaeal gat-queC homolog in E. coli could lead to G+ in tRNA and hence the formation of a preQ0 intermediate16.
Phage 9g Gat-QueC and DpdA insert G+ in DNA
As E. coli encodes the entire preQ0 biosynthesis pathway, we predicted that the dual expression of the viral gat-queC and dpdA genes in trans would lead to the insertion of 7-deazaguanine derivatives, such as dG+, in E. coli DNA. Because the presence of dG+ confers resistance to EcoRI digestion29, we used restriction profiles as a first indication for the presence of modifications in plasmid DNA. The two phage genes were both cloned into pBAD24 and pBAD33. EcoRI cuts pBAD24 once and pBAD33 twice, as shown in the digestion profiles of plasmids extracted from E. coli cotransformed with the two empty plasmids (Fig. 2b, c, lane 1). Because the gat-queC and dpdA genes of phage 9g lack EcoRI sites, the restriction profiles of plasmids extracted from E. coli derivatives cotransformed with an empty plasmid and a plasmid containing one of the two genes are shifted by the insert sizes (Fig. 2c, lanes 2, 3, 5, and 6). An additional band corresponding to the uncut plasmid was observed only for plasmids extracted from strains expressing both gat-queC and dpdA genes (Fig. 2c, lanes 4 and 7, and Fig. 2b, white arrows). As a supplemental control, we digested the same combination of plasmids with PsiI (TTA^TAA) and EcoRI (Supplementary Fig. 1). The single digestion by PsiI linearized all these plasmids, and the plasmids encoding both dpdA and gat-queC of phage 9g were again partially resistant to EcoRI digestion (red arrows in Supplementary Fig. 1).
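The banding logic behind these digests can be sketched in a few lines: complete digestion of a circular plasmid with n sites gives n linear fragments (a single full-length linear molecule when n = 1), whereas a protected plasmid stays uncut. This is a minimal illustration only. The sequences are toy placeholders rather than pBAD24 or pBAD33, the function names are hypothetical, and only the EcoRI recognition sequence GAATTC is a real value.

```python
def find_sites(seq, site):
    """Return 0-based start positions of `site` in a circular sequence."""
    seq = seq.upper()
    n = len(seq)
    # Append the first len(site)-1 bases so that sites spanning the origin are found.
    extended = seq + seq[: len(site) - 1]
    return [i for i in range(n) if extended.startswith(site, i)]


def circular_fragments(seq, site):
    """Fragment lengths after complete digestion of a circular molecule.

    Returns [] when no site is present (the plasmid stays circular).
    """
    n = len(seq)
    positions = find_sites(seq, site)
    if not positions:
        return []
    # Distance from each site to the next one, wrapping around the circle.
    # A single site gives a distance of 0, which means one full-length fragment.
    return sorted(
        ((positions[(k + 1) % len(positions)] - p) % n) or n
        for k, p in enumerate(positions)
    )


ECORI = "GAATTC"

# Toy circular sequences (placeholders, not real plasmid sequences).
one_site = "A" * 20 + ECORI + "T" * 24                      # 50 bp, one site
two_sites = "A" * 10 + ECORI + "C" * 14 + ECORI + "T" * 24  # 60 bp, two sites

print(circular_fragments(one_site, ECORI))   # one linear fragment
print(circular_fragments(two_sites, ECORI))  # two fragments
```

Partial protection, as seen for plasmids from strains expressing both genes, would appear as a mixture of these fragments and the uncut form.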
Analysis of dG+, dADG, dPreQ0, and dPreQ1 profiles by liquid chromatography-coupled triple quadrupole mass spectrometry (LC-MS/MS, quantification results in Table 1, mean ± standard deviation based on two or three replicates) revealed that plasmid DNA extracted from strains expressing only dpdA contained dPreQ0, with 790 ± 8 modifications per 10⁶ nucleotides (0.316 ± 0.0032% of the Gs) when expressed in pBAD24 and 84 ± 26 modifications per 10⁶ nucleotides (0.0336 ± 0.0104% of the Gs) when expressed in pBAD33. dG+ was detected in this strain just above the detection limit as well (6.5 ± 0.5 modifications per 10⁶ nucleotides, 0.0026 ± 0.0002% of the Gs). Plasmid DNA extracted from strains expressing dpdA and gat-queC contained dG+, with 45,000 ± 25,000 modifications per 10⁶ nucleotides, 18 ± 10% of the Gs, when DpdA was expressed in pBAD24 and Gat-QueC was expressed in pBAD33 and 22,750 ± 17,250 modifications per 10⁶ nucleotides, 9.1 ± 7% of the Gs, when reversed. dPreQ0 was also detected when gat-queC was expressed at lower levels than dpdA (77 ± 7 modifications per 10⁶ nucleotides, 0.0308 ± 0.0028% of the Gs). No modifications were detected in strains harboring empty plasmids or when only Gat-QueC was expressed (Table 1). Taken together, these results showed that dG+ but not preQ0 confers resistance to EcoRI and that the phage 9g pathway that inserts dG+ in its viral DNA can be transferred to modify E. coli genomic DNA.
Table 1.
Lane in Fig. 2b | Background | 9g gene in pBAD24 | 9g gene in pBAD33 | dADG per 10⁶ nt | dPreQ0 per 10⁶ nt | dPreQ1 per 10⁶ nt | dCDG per 10⁶ nt | dG+ per 10⁶ nt |
---|---|---|---|---|---|---|---|---|
1 | MG1655 | None | None | <6 | <6 | <6 | <6 | <6 |
2 | MG1655 | dpdA | None | <6 | 790 ± 8 | <6 | <6 | <6 |
3 | MG1655 | None | gat-queC | <6 | <6 | <6 | <6 | <6 |
4 | MG1655 | dpdA | gat-queC | <6 | 77 ± 7* | <6 | <6 | 45,000 ± 25,000 |
5 | MG1655 | gat-queC | None | <6 | <6 | <6 | <6 | <6 |
6 | MG1655 | None | dpdA | <6 | 84 ± 26 | <6 | <6 | 6.5 ± 0.5 |
7 | MG1655 | gat-queC | dpdA | <6 | <6 | <6** | <6** | 22,750 ± 17,250 |
8 | MG1655 ΔqueC | dpdA | gat-queC | <6** | <6** | <6** | <6** | 13,750** |
9 | MG1655 ΔqueC | gat-queC | dpdA | <6 | <6 | <6 | <6 | 23,000 ± 17,000 |
All values represent the mean ± deviation of the mean for two analyses, except asterisk (*), mean ± standard deviation for three replicate analyses, and double asterisks (**), single analysis
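The percentages of modified Gs quoted in the text appear to follow from the per-10⁶-nucleotide counts if G is taken to make up one quarter of the nucleotides. That fraction is inferred here from the reported numbers and is not stated in the text. As a worked example for lane 2, with m the number of modifications per 10⁶ nucleotides and f_G the assumed G fraction:

```latex
\[
\%\,\text{of Gs} \;=\; \frac{m}{10^{6}} \times \frac{1}{f_G} \times 100
\qquad\Rightarrow\qquad
\frac{790}{10^{6}} \times \frac{1}{0.25} \times 100 \;=\; 0.316\%
\]
```

The same relation gives 18% for the 45,000 dG+ modifications per 10⁶ nucleotides in lane 4.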
Interestingly, whereas we had failed to complement the Q− phenotype of the E. coli ΔqueC strain when expressing the gat-queC gene of phage 9g, the EcoRI resistance phenotype caused by 7-deazaguanine insertion in strains expressing both dpdA and gat-queC of phage 9g was still observed in a ΔqueC background (Fig. 2c, lanes 8 and 9) but not in a ΔqueD background (Fig. 2c, lanes 10 and 11). Furthermore, only dG+ modification was observed in the DNA of the ΔqueC strains by LC-MS/MS (Table 1), with similar amounts as in the wild type (WT; 13,750 modifications per 10⁶ nucleotides, 5.5% of the Gs, and 23,000 ± 17,000 modifications per 10⁶ nucleotides, 9.2 ± 7% of the Gs). This suggests that the Gat-QueC protein can produce preQ0 but that it is channeled to the putative DNA-modifying enzyme DpdA and not to the tRNA-modifying pathway enzyme QueF.
Finally, we tested whether the E. coli TGT was required for DpdA activity in E. coli, as the active forms of TGT enzymes are known to be dimers31. This did not seem to be the case, as the restriction resistance phenotype was still observed in the Δtgt background (Fig. 2c, lanes 12 and 13).
A wide variety of phages encode dG+ synthesis proteins
We identified another subfamily of DpdA, renamed DpdA2, encoded by the Vibrio phage nt-1 by investigating genes flanking the preQ0 biosynthesis gene cluster. Indeed, DpdA2 (YP_008125322) of phage nt-1 is not detected when using Enterobacteria phage 9g DpdA as a query in PSI-BLAST. This DpdA2 family does not possess the conserved histidine found at position 196 (ref. 11). However, some similarities with members of the TGT family were detected using HHpred, with a confidence score of 100%.
An in silico search for phages that could harbor 7-deazaguanine derivatives in their genomic DNA revealed a total of 182 viruses deposited in GenBank that were found to encode a DpdA/DpdA2 homolog and/or at least a G+ synthesis gene (Supplementary Data 1). Most of these viruses (163/182) were bacteriophages, while 16 were archaeal viruses and 3 were eukaryotic viruses. The eukaryotic viruses only encode FolE, which is most likely linked to the folate pathway32. Analyses of the presence/absence patterns of the predicted Q/G+ biosynthesis genes led to a classification of these viruses into various groups and, in some cases, predicted the nature of the 7-deazaguanine base modification. It is important to note that no homologs to the proteins specifically involved in Q biosynthesis, such as QueA, QueG, or QueH (see Fig. 1), were found in the viruses analyzed.
The first group contains 25 phages and is represented by Enterobacteria phage 9g (KJ419279), Streptococcus phage Dp-1 (NC_015274), and Vibrio phage nt-1 (NC_021529) in Fig. 3. These phages encode homologs of 9g DpdA or nt-1 DpdA2 as well as of FolE, QueD, QueE, and QueC. In addition, they encode homologs of one of the three amidotransferases involved in the last steps of G+ synthesis: ArcS15, QueF-L16 (or QueF), or a glutamine amidotransferase (Gat) domain fused to the canonical QueC16. These phages likely modify their DNA with dG+, as does phage 9g11. It should be noted that the discrimination between the QueF-L homologs, predicted to produce the G+ base from preQ0, and QueF homologs, predicted to produce preQ1 from preQ0, is difficult to establish based only on sequence similarity. Therefore, the phages encoding these proteins might harbor dG+ or dPreQ1 (or both). Of note, this viral group includes a Pseudomonas aeruginosa phage that was isolated; the genome of this phage was sequenced in this study, and the phage was named Pseudomonas phage Quinobequin P09 (description in Supplementary Information).
The second group includes 40 phages and is represented by E. coli phage CAjan (NC_028776) and Mycobacterium phage Rosebush (AY129334) in Fig. 3. These phages encode a homolog of one of the two types of DpdA and of the preQ0 synthesis enzymes (FolE, QueD, QueE, and QueC), but they are missing an amidotransferase. As such, we predicted that these phages modify their DNA with preQ0 or ADG, similar to the bacteria that contain the dpd cluster11. Mycobacterium phage Bipper (KU728633), which is only missing a gene coding for QueC, was added to this group even if it could be modified by the QueC substrate (7-carboxy-7-deazaguanine, see Fig. 1). The Uncultured phage clone 7AX_2 (MF417872) was also added to this group because it lacks queC, although this may be due to the incomplete genome sequence of this phage. In addition, we cannot exclude that this phage encodes an amidotransferase.
The third group is currently the largest, as it contains 76 phages, including Salmonella phage 7–11 (NC_015938) and Mycobacterium phage Orion (DQ398046), as shown in Fig. 3. These phages encode DpdA but no G+ or preQ0 biosynthesis protein homologs. At this stage, their genome modification status, if any, is difficult to predict. Phages in this group could rely on preQ0 synthesized by the host or on the uptake of exogenous 7-deazapurine precursors. Some phages do encode homologs of YhhQ, the preQ0 transporter33, but there is no correlation with any specific group of phages. The large size of this group compared to the others might be caused by the relatively large number of Mycobacteriophages in the Virus database due to the massive phage isolation and sequencing effort of PhagesDB and the SEA-PHAGES project34.
The last group is composed of 48 phages encoding proteins of the preQ0/G+ pathway but not DpdA. These phages could boost the production of the Q precursor to increase the level of Q in the host tRNA and increase translation efficiency35. However, it is possible that 7-deazaguanines are inserted in their DNA in a DpdA-independent pathway, as there is a recent report that the genomes of Campylobacter phages of this group are highly modified by dADG36. Similarly, the Halovirus HVTV-1 (NC_020158), presented in Fig. 3, may have found another way to insert the modifications and should harbor either dPreQ1 or dG+, as it encodes the QueF, or QueF-like, protein.
Phages containing FolE and QueC singletons were discarded from further analysis because FolE is shared between folate and preQ0 synthesis13, while QueC is also part of a superfamily of ATPases37, making their precise role difficult to identify.
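The presence/absence logic used to sort viruses into these four groups can be summarized as a small decision rule. The sketch below is a simplification and not the authors' actual pipeline: the `classify` function and the group labels are hypothetical, and special cases described in the text (for example Mycobacterium phage Bipper, placed in the second group despite lacking QueC) are covered only because any preQ0 synthesis gene counts. The example gene sets are those reported in the text for each phage.

```python
# Gene families named in the text; the rule set itself is a simplification.
PREQ0_SYNTHESIS = {"FolE", "QueD", "QueE", "QueC"}
# Enzymes the text associates with the last steps of G+ (or preQ1) synthesis.
LAST_STEP_ENZYMES = {"ArcS", "QueF-L", "QueF", "Gat-QueC"}
DPDA_FAMILY = {"DpdA", "DpdA2"}


def classify(genes):
    """Assign a virus to a group from the set of pathway proteins it encodes."""
    genes = set(genes)
    has_dpda = bool(genes & DPDA_FAMILY)
    has_last_step = bool(genes & LAST_STEP_ENZYMES)
    synthesis = genes & (PREQ0_SYNTHESIS | LAST_STEP_ENZYMES)

    if has_dpda:
        if has_last_step:
            return "group 1"  # DpdA + last-step enzyme: dG+ and/or dPreQ1 predicted
        if synthesis:
            return "group 2"  # DpdA + preQ0 synthesis: dPreQ0 or dADG predicted
        return "group 3"      # DpdA only: modification hard to predict
    if synthesis in ({"FolE"}, {"QueC"}):
        return "discarded"    # FolE or QueC singleton
    if synthesis:
        return "group 4"      # pathway genes but no DpdA
    return "not a candidate"


examples = {
    "Enterobacteria phage 9g": {"FolE", "QueD", "QueE", "Gat-QueC", "DpdA"},
    "Escherichia phage CAjan": {"FolE", "QueD", "QueE", "QueC", "DpdA"},
    "Mycobacterium phage Orion": {"DpdA"},
    "Halovirus HVTV-1": {"FolE", "QueD", "QueE", "QueC", "QueF-L"},
}
for name, genes in examples.items():
    print(name, "->", classify(genes))
```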
All the phages identified above are members of the Caudovirales order and are distributed into various families: Siphoviridae (95), Myoviridae (23), Ackermannviridae (20), and Podoviridae (3). For the Archaeal viruses, we identified 12 members of the Ligamenvirales order and 2 of the Bicaudaviridae family (Supplementary Data 2).
Detailed analysis of phage 7-deazaguanine synthesis proteins
To evaluate the isofunctionality of the studied protein families, sequence similarity networks (SSNs) were generated. Proteins in the same cluster should share the same function38. Several of the 7-deazaguanine biosynthesis proteins are part of protein families that are known to harbor subgroups with different functions that could impede functional annotations using only PSI-BLAST scores or HMM models, hence the use of SSNs to strengthen the annotation process.
As shown in Fig. 4a, phage DpdA proteins do not cluster with the TGT proteins from the three major kingdoms nor with the bacterial DpdA proteins identified previously11. Phage DpdA proteins clearly separate into four subgroups. One contains the DpdA proteins found in phages that encode the complete set of G+ or preQ0 synthesis proteins. The second and third groups are composed of singleton DpdA proteins, and the fourth group is composed of DpdA2 proteins. The singleton DpdAs are clustered in phages that infect the same clade of bacteria (Mycobacterium and γ-Proteobacteria). This could be a sign of a rapid divergence of this protein subfamily, and more studies will be required to determine whether this subset of DpdA proteins has functionally diverged.
Most phage QueC proteins do not cluster with bacterial QueC proteins when the BLAST threshold score is sufficient to separate QueC from the Gat-QueC groups (Fig. 4b). However, when a lower threshold score is used, the QueC and Gat-QueC proteins can be connected (Supplementary Fig. 2A). This is not the case for the QueC proteins encoded as singletons in phages, such as Bacillus phage SP-15 and Salmonella phage SFP10 (Supplementary Data 1), suggesting that even though the proteins were identified as QueC by HHpred, they may be part of a functionally unrelated subgroup of the N-type ATP pyrophosphatases superfamily37. Finally, phage and archaeal Gat-QueC proteins form a single cluster, strengthening their functional association.
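The dependence of clustering on the score threshold can be illustrated with a toy network: proteins are nodes, an edge is kept when the pairwise score reaches the threshold, and clusters are the connected components. This is a sketch of the general SSN idea only. The protein labels and scores below are invented for illustration, and the `ssn_clusters` function is hypothetical rather than the tool used in the study.

```python
def ssn_clusters(scores, threshold):
    """Connected components of a similarity network.

    scores: dict mapping (protein_a, protein_b) to a pairwise similarity score.
    An edge is kept only when the score is at least `threshold`.
    """
    nodes = sorted({name for pair in scores for name in pair})
    parent = {name: name for name in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in nodes:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Invented scores, chosen only to mimic the qualitative behavior described above.
toy_scores = {
    ("bact_QueC_1", "bact_QueC_2"): 120,
    ("phage_GatQueC", "arch_GatQueC"): 110,
    ("bact_QueC_1", "phage_GatQueC"): 45,
    ("singleton_QueC", "bact_QueC_2"): 10,
}

print(ssn_clusters(toy_scores, threshold=80))  # QueC and Gat-QueC separate
print(ssn_clusters(toy_scores, threshold=40))  # QueC and Gat-QueC connect
```

With the stringent threshold the QueC and Gat-QueC groups separate, with the relaxed one they merge, and the weakly related singleton stays isolated in both cases.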
HHpred predicted that the QueF family proteins encoded by phages are, for most of them, closer to the archaeal QueF-L proteins than to the bacterial QueF proteins (see Supplementary Data 1). However, they clustered with bacterial QueF proteins in the SSNs (Fig. 4c). Further experimental studies are required to determine whether the phage QueF proteins are nitrile reductases or amidotransferases (Fig. 1).
SSNs for the FolE, QueD, QueE, and ArcS families are shown in Supplementary Fig. 2B–E. The phage proteins cluster nicely with their bacterial and archaeal homologs, reinforcing the initial functional annotations.
The host may participate in phage DNA modification
To study the interaction between phages containing 7-deazaguanine-related genes and their bacterial hosts, we gathered metadata on the hosts and their habitat using RefSeq39 and the Globi database40 and analyzed the distribution of Q, G+, and dADG synthesis genes in these organisms (see Supplementary Data 2 and 3). Interestingly, 106 of the collected phages (~60%) infect a host strain that is the model for a known bacterial pathogen (Supplementary Data 2), whereas only ~9% of all the double-stranded DNA (dsDNA) viruses from the Virus-Host database41 infect a strain related to pathogens (data not shown), making our sample six to seven times more enriched than a random sample. No clear environment was found for the archaeal hosts.
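The six- to seven-fold figure follows directly from the two rounded percentages quoted above:

```latex
\[
\text{fold enrichment} \;\approx\; \frac{0.60}{0.09} \;\approx\; 6.7
\]
```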
All phage hosts predicted to modify their DNA with G+ possess the pathway to produce Q in tRNA. Curiously, the hosts of phages coding for a QueF-L and a 9g DpdA homolog do not encode the preQ0 biosynthetic pathway (QueDEC, see Fig. 1) but encode the specific preQ0 transporter YhhQ33 and the rest of the Q pathway (QueFAG and TGT, Fig. 1). Conversely, all the hosts of the DpdA2-encoding phages encode the full Q pathway.
There is no clear pattern for the bacterial hosts of phages encoding both DpdA and the whole preQ0 pathway. Most of them encode the full Q pathway enzymes except for Streptococcus pneumoniae, which lacks the preQ0 pathway genes; Rhodococcus erythropolis, which encodes only TGT; and Mycobacteria, which possess none of these genes.
The hosts of the phages encoding only DpdA also encode the full set of Q synthesis enzymes except the Clostridium species, which lack the preQ0 pathway genes, and the Mycobacterium genus, which possesses none of these genes. Sulfolobi were not referenced in PubSEED42, but by performing a BLASTp search with default parameters and the genes listed in Supplementary Table 1 as queries, we identified all G+ pathway genes (Supplementary Table 2). Hence, the 7-deazaguanine intermediates produced by these hosts, Clostridium and Mycobacterium excluded, might be used by phages that lack the biosynthesis proteins to produce a 7-deazaguanine precursor.
Finally, the hosts of the phages that do not encode a DpdA homolog but encode the preQ0 pathway proteins all encode the full Q synthesis pathway.
A few bacterial hosts, such as 46 different strains of E. coli, Haloarcula vallismortis, and Vibrio harveyi 1DA3, also harbor homologs of the bacterial DpdA, which are known to modify bacterial DNA by either dPreQ0 or dADG11.
Different 7-deazaguanine modifications in distinct phages
To test our predictions on the nature of phage DNA modifications, a set of phages from each group were selected (Fig. 3), and their genomic DNAs were extracted for mass spectrometric analysis (Table 2, mean ± standard deviation based on two replicates). No 2′-deoxyqueuosine (dQ) was found in any of the tested samples, correlating with the fact that no phage or virus encodes the specific protein for Q synthesis (QueAGH).
Table 2.
Phage/virus Accession # | Phage/virus name | Phage/virus GC content | Prediction based on gene content | dPreQ0 per 10⁶ nt | dADG per 10⁶ nt | dG+ per 10⁶ nt | dPreQ1 per 10⁶ nt | dQ per 10⁶ nt |
---|---|---|---|---|---|---|---|---|
NC_028776 | Escherichia phage CAjan | 44.70% | dPreQ0 | 70,628 ± 2445 | <6 | <6 | <6 | <6 |
None | Escherichia phage CAjan ΔdpdA | None | | <6 | <6 | <6 | <6 | <6 |
NC_020158 | Halovirus HVTV-1 | 58.30% | None/dG+ | <6 | 152 ± 3 | 22 ± 1 | 88,607 ± 3014 | <6 |
NC_008197 | Mycobacterium phage Orion | 66.50% | None | <6 | <6 | <6 | <6 | <6 |
NC_004684 | Mycobacterium phage Rosebush | 69.00% | dPreQ0 | 96,530 ± 2529 | 9 ± 1 | <6 | <6 | <6 |
NC_015938 | Salmonella phage 7–11 | 44.10% | None/PreQ0 | <6 | 50 ± 2 | <6 | <6 | <6 |
NC_015274 | Streptococcus phage Dp-1 | 40.30% | dPreQ1/dG+ | <6 | <6 | <6 | 3389 ± 184 | <6 |
NC_021529 | Vibrio phage nt-1 | 41.30% | dG+ | 232 ± 4 | 72 ± 2 | 44 ± 1 | <6 | <6 |
All values represent the mean ± deviation of the mean for two analyses
Phages of the first group encoding both a DpdA and one of the amidotransferase homologs were analyzed. Streptococcus phage Dp-1 DNA, encoding a QueF-L, contained a large amount of dPreQ1 (3389 ± 184 modifications per 10⁶ nucleotides, ~1.7 ± 0.09% of the Gs) but no dG+, which would mean that the QueF-L of this phage would actually be functionally closer to bacterial QueF than archaeal QueF-L, as predicted by the SSN clustering (Supplementary Fig. 2). Vibrio phage nt-1, encoding an ArcS, was shown to harbor not only dG+ (44 ± 1 modifications per 10⁶ nucleotides, ~0.02 ± 0.0005% of the Gs) but also dPreQ0 and dADG (232 ± 4 modifications per 10⁶ nucleotides, ~0.11 ± 0.002% of the Gs, and 72 ± 2 modifications per 10⁶ nucleotides, ~0.035 ± 0.001% of the Gs, respectively). This result might indicate that nt-1 DpdA is more promiscuous and could insert all intermediates of the pathway.
Then we investigated phages of the second group that encode both a DpdA and the four proteins of the preQ0 biosynthesis pathway but no amidotransferase homolog. Mycobacterium phage Rosebush was found to harbor dPreQ0 in its DNA (96,530 ± 2529 modifications per 10⁶ nucleotides, ~28 ± 1% of the Gs), as does Escherichia phage CAjan (70,628 ± 2445 modifications per 10⁶ nucleotides, ~32 ± 1% of the Gs). However, Mycobacterium phage Rosebush was also found to harbor a negligible amount of dADG (9 ± 1 modifications per 10⁶ nucleotides, ~0.003 ± 0.0003% of the Gs).
The genomic DNAs of Salmonella phage 7–11 and Mycobacterium phage Orion from the third group of phages, which only encode a DpdA, were also analyzed by LC-MS/MS. Mycobacterium phage Orion lacked any 7-deazaguanine modifications in its DNA. This result was expected, as neither the phage nor its host encodes the preQ0 biosynthesis pathway (Mycobacterium smegmatis, see Supplementary Data 3). However, Salmonella phage 7–11 was unexpectedly modified by dADG (50 ± 2 modifications per 10⁶ nucleotides, ~0.02 ± 0.0009% of the Gs), suggesting that the phage encoded a protein responsible for the oxidation of preQ0.
Finally, Halovirus HVTV-1, which encodes the four proteins of the preQ0 biosynthesis pathway and a QueF-L homolog but no DpdA, contained mainly dPreQ1 (88,607 ± 3014 modifications per 10⁶ nucleotides, ~30 ± 1% of the Gs) but also relatively small amounts of dADG and dG+ (152 ± 3 modifications per 10⁶ nucleotides, ~0.05 ± 0.001% of the Gs, and 22 ± 1 modifications per 10⁶ nucleotides, ~0.008 ± 0.0003% of the Gs, respectively). As its host, H. vallismortis, harbors a DpdA homolog, it is possible that the host DpdA inserts preQ0 in Halovirus HVTV-1 DNA before it is further modified to dPreQ1 or dG+ by the viral QueF-L or to dADG by another unidentified protein.
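The share of modified Gs quoted for each phage appears to be derived from the per-10⁶-nucleotide counts and the genome GC content in Table 2. The sketch below reproduces the quoted values under one assumption that is not stated in the text: on double-stranded DNA the G fraction is half the GC content. The function name `percent_of_g` is hypothetical.

```python
def percent_of_g(mods_per_million_nt, gc_percent):
    """Percentage of G positions carrying a modification.

    Assumes double-stranded DNA, so the G fraction equals half the GC content.
    """
    g_fraction = gc_percent / 100 / 2
    return mods_per_million_nt / 1e6 / g_fraction * 100


# (modifications per 10^6 nt, GC content in %) taken from Table 2.
table2 = {
    "Mycobacterium phage Rosebush, dPreQ0": (96530, 69.0),
    "Escherichia phage CAjan, dPreQ0": (70628, 44.7),
    "Halovirus HVTV-1, dPreQ1": (88607, 58.3),
    "Streptococcus phage Dp-1, dPreQ1": (3389, 40.3),
}
for label, (count, gc) in table2.items():
    print(f"{label}: {percent_of_g(count, gc):.2f}% of the Gs")
```

These values round to the ~28%, ~32%, ~30%, and ~1.7% reported in the text.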
dpdA is essential for DNA modification
To evaluate the role of the 7-deazaguanine modifications in phages, we used the Escherichia phage CAjan as a genetic model. CAjan is a virulent phage belonging to the Seuratvirus genus of the Siphoviridae family with many similarities with Enterobacteria phage 9g, particularly within the 7-deazaguanine modification pathway43. Using the CRISPR-Cas9 genome editing technology44, we generated a CAjan derivative with an inactive allele of the dpdA gene (Supplementary Fig. 3A). The presence of this allele was confirmed by PCR and sequencing (Supplementary Fig. 3B). The LC-MS/MS analysis of the DNA of the mutated phage showed a complete lack of 7-deazaguanine modifications (Table 2).
The DNA modifications protect DNA from restriction enzymes
The different modifications present in the phages analyzed above may lead to distinct resistance patterns to host defense mechanisms, such as RM systems. To test this hypothesis, phage DNA preparations were digested with a set of restriction enzymes that had been shown to be totally or partially inactivated in the presence of the dG+ modification29. As a control, we reproduced the results published with Enterobacteria phage 9g DNA (Fig. 5a); no digestion was observed with BamHI, EcoRI, EcoRV, and SwaI, while it was partially restricted with BstXI, HaeIII, MluI, NdeI, and PciI.
Mycobacterium phage Rosebush DNA that carries preQ0 showed a slightly different pattern of resistance. The restriction profiles for BamHI, BstXI, and EcoRV were identical to those of Enterobacteria phage 9g. However, Rosebush DNA was fully sensitive to HaeIII, MluI, and PciI and resisted NdeI degradation (Fig. 5b). EcoRI and SwaI could not be tested because the corresponding sites are absent in the Mycobacterium phage Rosebush genome.
Though Escherichia phage CAjan DNA carries the same modification as Mycobacterium phage Rosebush DNA, differences in the restriction patterns were observed (Fig. 5c). Indeed, while EcoRI and SwaI fully digested this DNA preparation, BamHI digested it only partially, and HaeIII did not cut at all. These differences could be explained by the additional small amount of dADG present in Mycobacterium phage Rosebush DNA, by the differences in modification density potentially affecting accessibility to the restriction sites, or by the presence of another undetected modification. In comparison, the ΔdpdA mutant of CAjan, lacking any modifications, was fully digested by all the tested restriction enzymes (Fig. 5d), formally linking the presence of the dpdA gene and the dPreQ0 modification to the restriction resistance phenotype.
Last but not least, Halovirus HVTV-1 DNA that carries mainly dPreQ1 was found to resist restriction by all enzymes tested, even those that lack guanine in the recognition site (Fig. 5e and Supplementary Fig. 4). It is possible that this virus has other modifications that help resist restriction; if not, dPreQ1 is the best modification for protection from restriction enzymes identified in this study.
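Whether a recognition site can carry a 7-deazaguanine at all depends on whether it contains a G on either strand, which is a quick check on the sequence. In the sketch below the recognition sequences are standard published values rather than data from this study (only the PsiI site is quoted in the text), BstXI is left out because its site contains degenerate positions, and the function name is hypothetical.

```python
# Standard recognition sequences (5'->3', top strand); not taken from this study.
RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "HaeIII": "GGCC",
    "MluI": "ACGCGT",
    "NdeI": "CATATG",
    "PciI": "ACATGT",
    "PsiI": "TTATAA",
    "SwaI": "ATTTAAAT",
}


def guanines_in_site(site):
    """Count guanines on both strands of a double-stranded site.

    A G on the top strand is counted directly; a C on the top strand
    pairs with a G on the bottom strand.
    """
    site = site.upper()
    return site.count("G") + site.count("C")


no_guanine = sorted(
    enzyme for enzyme, site in RECOGNITION_SITES.items()
    if guanines_in_site(site) == 0
)
print(no_guanine)
```

Resistance to enzymes whose sites contain no guanine cannot be explained by a modified base inside the site itself, which is why such cases point to effects of flanking modifications or to additional, undetected modifications.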
Discussion
In a previous study11, we identified two 7-deazaguanine modifications in DNA: dADG in bacteria and dG+ in phages. Here we added two modifications, dPreQ1 and dPreQ0, both found in phages. Similar to the result of Szymanski’s group on Campylobacter phages36, we also detected dADG in phage genomes. We identified the genes involved in the synthesis of these different modifications. FolE, QueD, and QueE from Enterobacteria phage 9g were shown to functionally replace their E. coli orthologs (Fig. 2a), and their clustering in SSNs (Supplementary Fig. 2) leaves no doubt about the isofunctionality of these families. No individual phage QueC was tested, but the strong clustering of bacterial, archaeal, and phage QueC proteins in SSNs also points to identical functions. One exception may be the singleton-encoded QueC-like protein, found in Escherichia phage ECML-4 (YP_009101458 in NC_025446) or Mycobacterium phage Muddy (YP_008408902 in NC_022054), which is likely a member of another subfamily of the N-type ATP pyrophosphatases superfamily38.
Most 7-deazaguanine-containing phage genomes also harbor a gene coding for a DpdA homolog. As with its bacterial homolog27, the phage DpdA introduces PreQ0 in DNA (Fig. 2c, Table 1), most likely through a base exchange mechanism similar to its TGT homolog31. DpdA2 proteins appear to share this function, as the Vibrio phage nt-1 genome contains dPreQ0 (Table 2 and Fig. 3). However, not all phages/viruses containing 7-deazaguanines encode DpdA proteins, as observed with Halovirus HVTV-1 (Table 2 and Fig. 3). It is possible that, in the case of HVTV-1, the host DpdA is responsible for the presence of modifications in its genome (EMA11768 in AOLQ01000002). Nevertheless, a DpdA is not always present in the host, and there could be some cases where the phages encode a machinery to synthesize a modified dGTP that is used by DNA polymerase, as proposed for Campylobacter phages36. Finally, one cannot rule out that some phages may harbor undetected 2′-deoxyribosyltransferases.
The combination of comparative genomic analyses and experimental validations has allowed pathways for the insertion of dPreQ0, dPreQ1, and dG+ in phage genomes to be predicted (Fig. 6). The presence of the minimal set of FolE, QueD, QueE, QueC, and DpdA proteins leads to the insertion of dPreQ0, as observed in Mycobacterium phage Rosebush and Escherichia phage CAjan genomes (Table 2 and Fig. 3). The replacement of QueC by Gat-QueC leads to the introduction of dG+ (Fig. 2c, Table 1 and previous study11). However, it is not known whether Gat-QueC converts preQ0 into G+ before or after it is inserted into DNA. The function of ArcS homologs in phages/viruses is less clear. Indeed, Vibrio phage nt-1 encodes an ArcS homolog, and its DNA contains mainly dPreQ0 but also dG+ and dADG (Table 2 and Fig. 3). ArcS was the first G+ synthase identified in archaea15. Based on the phage and archaeal ArcS cluster in the SSNs (Supplementary Fig. 2), it is possible that some phage ArcS proteins evolved to perform not only an amidotransferase reaction, such as the archaeal ArcS15, but also an amidohydrolase reaction, such as the bacterial DpdC27. Further biochemical characterization will be required to explore these hypotheses. One cannot exclude the possibility that the small amount of dADG detected in Vibrio phage nt-1, Halovirus HVTV-1, Mycobacterium phage Rosebush, and Escherichia phage CAjan could be the result of the natural oxidation of dPreQ0 (ref. 45).
The discrepancy observed between the SSNs and HHpred predictions for the QueF/QueF-L homologs was resolved by analyzing Streptococcus phage Dp-1 and Halovirus HVTV-1 DNA. HHpred analysis predicted that a homolog of the archaeal QueF-L, which synthesizes G+-tRNA from the preQ0-tRNA46, was encoded by these phages, whereas the SSN analysis predicted that this same protein was part of a group of bacterial QueF proteins (Fig. 4) that synthesize preQ1 from the free preQ0 base18. We found that the DNA of Streptococcus phage Dp-1 and Halovirus HVTV-1 contained dPreQ1, confirming the SSN prediction. However, it is unclear whether the reduction occurs on free preQ0, as with the bacterial QueF proteins18, with the free preQ1 base then being inserted by DpdA, or whether the phage QueF is able to modify the DNA-bound dPreQ0, as the archaeal QueF-L does with tRNA46. In addition, Halovirus HVTV-1 contains mainly dPreQ1 but also small amounts of dADG and dG+. It is possible that the QueF-L can switch from functioning as an amidohydrolase to functioning as an amidotransferase, but one cannot rule out that the host ArcS could catalyze the reaction, although the PUA domain specific for tRNA binding makes it highly unlikely15.
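The gene-to-modification relationships discussed above can be summarized as a small rule table. The following is only an illustrative sketch: the function name and the all-or-none rules are assumptions made for this example, and it predicts a single major modification while ignoring the minor species (such as dADG) and the host-encoded enzymes mentioned in the text.

```python
# Illustrative rule table, not a validated predictor.
# Gene names follow the text; the decision rules are a simplification.
CORE_GENES = {"FolE", "QueD", "QueE"}   # shared early steps of the pathway
INSERTION_GENES = {"DpdA", "DpdA2"}     # proteins proposed to insert the base into DNA


def predict_major_modification(genes):
    """Return the expected major 7-deazaguanine modification, or None.

    None means this sketch cannot make a prediction, for example when
    no phage-encoded DpdA is present and a host enzyme may be involved.
    """
    genes = set(genes)
    if not CORE_GENES <= genes or not (genes & INSERTION_GENES):
        return None
    if "Gat-QueC" in genes:
        return "dG+"            # Gat-QueC replaces QueC
    if "QueC" in genes:
        # A bacterial-type QueF reduces preQ0 to preQ1
        return "dPreQ1" if "QueF" in genes else "dPreQ0"
    return None


minimal_set = {"FolE", "QueD", "QueE", "QueC", "DpdA"}
print(predict_major_modification(minimal_set))
print(predict_major_modification((minimal_set - {"QueC"}) | {"Gat-QueC"}))
print(predict_major_modification(minimal_set | {"QueF"}))
```

A virus such as Halovirus HVTV-1, which lacks its own DpdA, returns None here, which reflects the open question in the text rather than an absence of modification.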
From a biological perspective, 7-deazaguanine modifications seem to dramatically decrease the susceptibility of phage genomes to host RM systems. RM systems are one of the major defense systems for bacteria to prevent invasion by foreign DNA5. Phages evolved to escape these RM systems by different methods, including modification of their genomic DNA9,11,47,48. It was previously observed that the genome of Enterobacteria phage 9g contains dG+11 and is fully or partially resistant to a wide variety of restriction enzymes29. In this study, we directly linked the presence of the modification to the restriction resistance phenotype. Escherichia phage CAjan with mutations in dpdA no longer contains dPreQ0 modifications (Table 2) and is sensitive to all the restriction enzymes tested (Fig. 5). In addition, all 7-deazaguanine-modified DNA preparations tested were protected to various degrees from digestion by restriction enzymes. We also observed that introducing dG+ modifications in the E. coli genome protected against cleavage by EcoRI (Fig. 2). These modifications might also block other DNA-binding proteins that require the nitrogen moiety at position 7 of the guanine to recognize their substrates, the most critical being sigma and transcription factors. However, phages only use the housekeeping sigma factor49, which has an AT-rich recognition sequence50, and encode their own transcription factors51.
Finally, the distribution of these modifications among phages seems to correlate with their host range, namely, bacterial pathogenic species. Interestingly, this was also observed in bacteria, where many pathogens harbor dADG modifications11. Although it is not clear how 7-deazaguanine modifications are spread through phage isolates, these modifications might give a selective advantage to pathogenic species. These 7-deazaguanine-modified phages are also most likely more adapted to propagate in hosts with modified DNA. We can only speculate on how bacteria evolve to counteract this specific anti-restriction mechanism. As we were successful in deleting the dpdA gene from Escherichia phage CAjan using a CRISPR-Cas9 technique (see “Methods”), we know that these modifications do not provide resistance against the type II CRISPR-Cas system4. However, as the adaptive system of CRISPR-Cas recognizes the nitrogen in position 7 of the guanines in the PAM52, it is possible that these phages escape degradation by CRISPR-Cas by preventing the adaptation system from binding to its target DNA. One could also imagine that other means of defense, described in recent reviews2,3, provide an efficient protection mechanism against these phages or that some bacteria evolved means of defense yet to be discovered.
Methods
Strains, phages, plasmids, and oligonucleotides
The bacterial strains used in this study are listed in Supplementary Data 4. Phages are listed in Supplementary Data 5. Plasmids are listed in Supplementary Table 3, and plasmid constructions are described in Supplementary Information. Oligonucleotides are listed in Supplementary Data 6.
Q detection in tRNA
Overnight bacterial cultures were diluted 100-fold into 5 mL of LB supplemented with 0.4% arabinose and 100 µg/mL ampicillin and grown for 2 h at 37 °C. Cells were harvested by centrifugation at 16,000 × g for 1 min at 4 °C. Cell pellets were immediately resuspended in 1 mL of Trizol (Life Technologies, Carlsbad, CA). Small RNAs were extracted using the PureLink™ miRNA Isolation Kit from Invitrogen (Carlsbad, CA) according to the manufacturer’s protocol. Purified RNAs were eluted in 50 μL of RNase-free water, and tRNA concentrations were measured with a NanoDrop® ND-1000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA). Then 200 ng of RNA was migrated in a 10% acrylamide/bisacrylamide (29:1) gel containing 1× Tris-acetate-EDTA (TAE) and 8 M urea, supplemented with 5 µg/mL 3-(acrylamido)-phenylboronic acid, as described in detail previously27. The migrated samples were transferred onto a Biodyne™ B Nylon membrane (0.45 µm, Thermo Scientific, Rockford, IL). tRNA samples were detected using a probe (5′-biotin-CCCTCGGTGACAGGCAGG-3′) that anneals with tRNAAsp(GUC), at a final concentration of 0.3 μM, and the Chemiluminescent Nucleic Acid Detection Module Kit (Thermo Scientific, Rockford, IL), except that the first blocking buffer was changed to the DIG Easy Hyb buffer (Roche, Mannheim, Germany).
Restriction assay for deazapurine presence in plasmid DNA
E. coli strains containing different variations of pBAD24 and pBAD33 (with or without dpdA or gat-queC from Enterobacteria phage 9g, see Supplementary Information) were grown overnight in LB supplemented with 100 µg/mL ampicillin, 20 µg/mL chloramphenicol, and 0.2% glucose at 37 °C. Each strain was diluted 100-fold in LB supplemented with 100 µg/mL ampicillin, 20 µg/mL chloramphenicol, and 0.4% arabinose and grown for 6 h at 37 °C. Plasmids were extracted using the Qiagen QIAprep Spin Miniprep Kit, and 500 ng of plasmid was digested by EcoRI-HF (New England Biolabs, Ipswich MA) for 1 h at 37 °C in 20 µL of CutSmart buffer. The enzyme was inactivated by 20-min incubation at 80 °C. The samples were run on a 0.5% agarose gel in 1× TAE. The gel was then stained with 0.5 µg/mL ethidium bromide for 30 min, washed 3 times for 15 min in water, and visualized with the Azur Biosystem c200 Gel Doc system (Thermo Fisher Scientific, Waltham, MA, USA).
Search for phage encoding Q and G+ biosynthesis proteins
The Viruses nr database from NCBI was queried by three iterations of PSI-BLAST53, with the default set-up as previously suggested54, using the proteins referenced in Supplementary Table 1 known to be involved in Q or G+ biosynthesis, as well as DpdA from Enterobacteria phage 9g, predicted to be involved in the modification of phage DNA, and another DpdA2 from Vibrio phage nt-1, part of a family identified in this study. The preQ0-specific transporter YhhQ33 was also added. For each virus identified with at least one of these genes, a reverse analysis was performed (phage genome against the protein list) to ensure that no protein was missed during the first analysis. The annotations for each identified ortholog were verified by HHpred55.
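The search strategy above (three PSI-BLAST iterations, followed by a reverse analysis to catch missed proteins) could be scripted roughly as follows. This is a minimal sketch that assumes a local BLAST+ installation; the file names, the database name `viral_nr`, and both helper functions are hypothetical and are not taken from the study, which may have used a different interface.

```python
import shlex


def build_psiblast_command(query_fasta, database, out_path, iterations=3):
    """Assemble a BLAST+ psiblast command line (built only, not executed here)."""
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-num_iterations", str(iterations),  # three iterations, as in the text
        "-outfmt", "6",                      # tabular output
        "-out", out_path,
    ]


def merge_forward_and_reverse(forward_hits, reverse_hits):
    """Combine per-virus sets of protein families found by either search direction."""
    merged = {}
    for hits in (forward_hits, reverse_hits):
        for virus, families in hits.items():
            merged.setdefault(virus, set()).update(families)
    return merged


# Hypothetical file and database names, for illustration only
command = build_psiblast_command("dpdA_9g.faa", "viral_nr", "dpdA_hits.tsv")
print(shlex.join(command))

# Toy hit tables: the reverse analysis recovers a QueC missed in the forward search
forward = {"phage_A": {"DpdA"}}
reverse = {"phage_A": {"DpdA", "QueC"}}
print(merge_forward_and_reverse(forward, reverse))
```

The command list can be passed to `subprocess.run` once real query and database paths are supplied.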
SSN generation
For each protein family (FolE, QueD, QueE, QueC/Gat-QueC, QueF/QueF-L, ArcS, and TGT), a representative set was imported from the OMA database56. For the DpdA from bacteria, the protein sequences were imported from the genomes identified previously11 through PubSEED42. To generate the protein network, the sequences in FASTA format were uploaded and analyzed online by the EFI-EST tool38. Each network was analyzed using the Cytoscape program57, and each family was clustered using the alignment score thresholds indicated in Fig. 4 and Supplementary Fig. 2.
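To illustrate what clustering by an alignment score threshold means in an SSN, the sketch below keeps only the edges whose score reaches the threshold and reports the resulting connected components. It is a toy model: the sequence names and scores are invented, and EFI-EST and Cytoscape perform these steps with their own scoring and layout, which are not reproduced here.

```python
from collections import defaultdict


def cluster_by_threshold(pair_scores, threshold):
    """Group sequences into connected components of the thresholded network.

    pair_scores maps (id_a, id_b) to an alignment score; an edge is kept
    only when the score is at least `threshold`.
    """
    neighbors = defaultdict(set)
    nodes = set()
    for (a, b), score in pair_scores.items():
        nodes.update((a, b))
        if score >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Invented scores for illustration only
scores = {
    ("phage_QueF", "bact_QueF_1"): 60,
    ("bact_QueF_1", "bact_QueF_2"): 75,
    ("phage_QueF", "arch_QueFL"): 20,
    ("arch_QueFL", "bact_QueF_2"): 15,
}
for cluster in cluster_by_threshold(scores, threshold=50):
    print(sorted(cluster))
```

Raising the threshold splits families into smaller clusters, and lowering it merges them, which is why the threshold used for each family is reported alongside the networks.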
Identification of the host and their gene content
The Virus-Host DB41 was used to obtain the host information for each phage identified in this study. For phages not referenced in this database, a manual investigation coupling RefSeq39 and the literature was performed (indicated as “manual” in the evidence line of Supplementary Data 3). Each host identified was queried in the Globi database40, and if they were identified as pathogens, the host was entered in the “Pathogen Of” column of Supplementary Data 3. The same analysis was performed for all the dsDNA phages of the Virus-Host DB, as only these phages were returned in our analysis (data not shown). A list of genomes was created on PubSEED42 from the identified hosts, and a spreadsheet was created. Proteins from Supplementary Table 1 were used to identify the correct annotation for each column of the spreadsheet. The results were collected and are shown in Supplementary Data 3.
Purification of phage and plasmid DNA
The purification of each phage DNA in this study was performed specifically for each phage and is described in Supplementary Information.
Mass spectrometric analysis
DNA analysis was performed as previously described with several modifications11. Purified DNA (20 μg) was hydrolyzed in 10 mM Tris-HCl (pH 7.9) with 1 mM MgCl2 with benzonase (20 U), DNase I (4 U), calf intestine phosphatase (17 U), and phosphodiesterase (0.2 U) for 16 h at ambient temperature. Following passage through a 10-kDa filter to remove proteins, the filtrate was lyophilized and resuspended to a final concentration of 0.2 µg/µL (based on initial DNA quantity).
Quantification of the modified 2′-deoxynucleosides (dADG, dQ, dPreQ0, dPreQ1, and dG+) and the four canonical 2′-deoxyribonucleosides (dA, dT, dG, and dC) was achieved by LC-MS/MS and an in-line diode array detector (LC-DAD), respectively. Aliquots of hydrolyzed DNA were injected onto a Phenomenex Luna Omega Polar C18 column (2.1 × 100 mm, 1.6 μm particle size) equilibrated with 98% solvent A (0.1% v/v formic acid in water) and 2% solvent B (0.1% v/v formic acid in acetonitrile) at a flow rate of 0.25 mL/min and eluted with the following solvent gradient: 12% B for 10 min, 1 min ramp to 100% B for 10 min, 1 min ramp to 2% B for 10 min. The high-performance liquid chromatographic column was coupled to an Agilent 1290 Infinity DAD and an Agilent 6490 triple quadrupole mass spectrometer (Agilent, Santa Clara, CA). The column was kept at 40 °C, and the autosampler was cooled at 4 °C. The ultraviolet wavelength of the DAD was set at 260 nm and the electrospray ionization of the mass spectrometer was performed in positive ion mode with the following source parameters: drying gas temperature, 200 °C with a flow of 14 L/min; nebulizer gas pressure, 30 psi; sheath gas temperature, 400 °C with a flow of 11 L/min; capillary voltage, 3,000 V; and nozzle voltage, 800 V. Compounds were quantified in multiple reaction monitoring mode with the following m/z transitions: 310.1 → 194.1, 310.1 → 177.1, 310.1 → 293.1 for dADG; 394.1 → 163.1, 394.1 → 146.1, 394.1 → 121.1 for dQ; 292.1 → 176.1, 176.1 → 159.1, 176.1 → 52.1 for dPreQ0; 296.1 → 163.1, 296.1 → 121.1, 296.1 → 279.1 for dPreQ1; and 309.1 → 193.1, 309.1 → 176.1, 309.1 → 159.1 for dG+. External calibration curves were used to quantify the modified and canonical 2′-deoxynucleosides. Calibration curves were constructed from replicate measurements of eight concentrations of each standard. A linear regression with r² > 0.995 was obtained in all relevant ranges.
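As an illustration of external calibration, the sketch below fits a least-squares line to eight standard concentrations, checks the coefficient of determination against the 0.995 criterion, and converts a sample peak area back to an amount. All numbers are invented for the example, and the function names are assumptions; the actual processing was done in the vendor software.

```python
def fit_calibration(amounts, areas):
    """Ordinary least-squares line through (amount, peak area) points.

    Returns (slope, intercept, r_squared).
    """
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(areas) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(amounts, areas))
    ss_tot = sum((y - mean_y) ** 2 for y in areas)
    return slope, intercept, 1.0 - ss_res / ss_tot


def quantify(peak_area, slope, intercept):
    """Invert the calibration line to estimate the injected amount."""
    return (peak_area - intercept) / slope


# Hypothetical standards: eight amounts (fmol) and their peak areas
standard_fmol = [1, 2, 5, 10, 20, 50, 100, 200]
standard_area = [160, 335, 770, 1540, 3010, 7560, 14980, 30050]

slope, intercept, r_squared = fit_calibration(standard_fmol, standard_area)
print(f"slope={slope:.2f} intercept={intercept:.2f} r2={r_squared:.5f}")
if r_squared > 0.995:  # acceptance criterion quoted in the text
    print(f"sample estimate: {quantify(6030, slope, intercept):.1f} fmol")
```

Replicate measurements would normally be averaged or fitted together; a single series is used here to keep the example short.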
The limit of detection, defined by a signal-to-noise ratio ≥3, ranged from 0.1 to 1 fmol for the modified 2′-deoxynucleosides. Data acquisition and processing were performed using the MassHunter software (Agilent, Santa Clara, CA).
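The precursor m/z values in the transitions listed above can be cross-checked against the protonated monoisotopic masses of the nucleosides. The sketch below does this for four of the analytes. The molecular formulas are not given in the text; they are derived here from the structures implied by the compound names and should be treated as assumptions of this example.

```python
# Monoisotopic masses of the elements involved and of a proton
ELEMENT_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON_MASS = 1.007276

# Assumed neutral formulas of the 2'-deoxynucleosides (derived, not from the text)
FORMULAS = {
    "dADG":   {"C": 12, "H": 15, "N": 5, "O": 5},
    "dPreQ0": {"C": 12, "H": 13, "N": 5, "O": 4},
    "dPreQ1": {"C": 12, "H": 17, "N": 5, "O": 4},
    "dG+":    {"C": 12, "H": 16, "N": 6, "O": 4},
}

# Precursor m/z values quoted in the MRM transitions
REPORTED_PRECURSOR = {"dADG": 310.1, "dPreQ0": 292.1, "dPreQ1": 296.1, "dG+": 309.1}


def protonated_mz(formula):
    """m/z of the singly protonated ion [M+H]+ for a neutral formula."""
    neutral = sum(ELEMENT_MASS[element] * count for element, count in formula.items())
    return neutral + PROTON_MASS


for name, formula in FORMULAS.items():
    calculated = protonated_mz(formula)
    difference = calculated - REPORTED_PRECURSOR[name]
    print(f"{name}: calculated {calculated:.4f}, reported {REPORTED_PRECURSOR[name]}, diff {difference:+.4f}")
```

Under these assumed formulas, each calculated value agrees with the reported precursor to within 0.05 m/z units, consistent with detection of the protonated molecules in positive ion mode.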
Phage genome editing using CRISPR-Cas9
Escherichia phage CAjan was genetically engineered as previously described58 and as summarized in Supplementary Fig. 2A. Briefly, E. coli MG1655 was transformed with two plasmids, pL2Cas9_dpdAΔ (see Supplementary Information for detailed construction method), which contained a spacer (5′-TGCGGTCAAGCCAAGTCTTAAGCGTGTCCG-3′) targeting the dpdA gene of Escherichia phage CAjan, and pNZ123_dpdAΔ (see Supplementary Information for detailed construction method), which carried a homologous repair template with a partially deleted, nonfunctional allele of the dpdA gene (del29212-29521). Phage engineering was accomplished by infecting the modified host with WT Escherichia phage CAjan and isolating the resulting phage mutants. The infection step was repeated twice, and the resulting mutants were verified by PCR and whole-genome sequencing as described elsewhere59.
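When designing or checking such a construct, it can help to confirm where a spacer matches the target genome and how large the deletion is. The sketch below searches both strands for an exact match and computes the deletion length. The toy genome is invented, the strand on which the real spacer matches is not stated in the text, and the deletion length assumes that the reported coordinates are inclusive.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of an uppercase DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_protospacer(genome, spacer):
    """Return (strand, 0-based start) for every exact match on either strand."""
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        start = genome.find(query)
        while start != -1:
            hits.append((strand, start))
            start = genome.find(query, start + 1)
    return hits


SPACER = "TGCGGTCAAGCCAAGTCTTAAGCGTGTCCG"        # spacer sequence from the text
toy_genome = "ATATAT" + SPACER + "GGCCTA"         # invented flanks, for illustration

print(len(SPACER))
print(find_protospacer(toy_genome, SPACER))
print(find_protospacer(reverse_complement(toy_genome), SPACER))

# Size of del29212-29521, assuming both coordinates are included in the deletion
deletion_length = 29521 - 29212 + 1
print(deletion_length)
```

A real check would load the phage genome sequence from its database record and would also inspect the sequence adjacent to each hit, which this sketch does not attempt.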
Restriction assay of phage DNA
A total of 250 ng of phage DNA was digested by the enzymes (New England Biolabs) described in Fig. 5 for 1 h at 37 °C in 20 µL of CutSmart or 3.1 Buffer solution, according to the manufacturer’s instructions. The enzymes were inactivated by incubation at 80 °C for 20 min. The samples were run on a 0.7% agarose gel in 1× TAE. The gel was then stained for 30 min in 0.5 μg/mL ethidium bromide, washed 3 times for 15 min in water, and visualized with the Azur Biosystem c200 Gel Doc system.
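The banding pattern expected from such an assay can be approximated in silico. The sketch below cuts a linear sequence at every recognition site and lets selected sites be marked as protected. It is a simplified model with an invented sequence: protection is treated as all-or-none per site, whereas the protection observed experimentally was partial for some enzymes.

```python
def digest_linear(sequence, site, cut_offset, protected_sites=()):
    """Fragment lengths after digesting a linear DNA molecule.

    site: recognition sequence on the top strand.
    cut_offset: position of the top-strand cut within the site.
    protected_sites: 0-based start positions of sites assumed to resist cleavage.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        if start not in protected_sites:
            cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1   # EcoRI cuts G^AATTC

# Invented 26-bp sequence with EcoRI sites starting at positions 4 and 18
toy_dna = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"

print(digest_linear(toy_dna, ECORI_SITE, ECORI_OFFSET))                             # unmodified DNA
print(digest_linear(toy_dna, ECORI_SITE, ECORI_OFFSET, protected_sites={18}))       # one site protected
print(digest_linear(toy_dna, ECORI_SITE, ECORI_OFFSET, protected_sites={4, 18}))    # fully protected
```

Comparing the predicted fragment sizes for unmodified DNA with the observed gel pattern indicates which sites were protected.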
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
This work was funded in part by the National Institutes of Health (grant GM70641 to V.d.C.-L. and P.C.D.), the Human Frontier Science Program (grant RGP0024 to L.H., V.d.C.-L., and S.M.), and the Villum Experiment (grant 17595 to W.K.). We thank the 2009 and 2016 MIT students of the 7.396 independent activity period class who isolated the Pseudomonas phage Quinobequin P09; Cameron Haase-Pettingell for assistance with electron microscopy; Dennis Bamford for Halovirus HVTV-1 and its host Haloarcula vallismortis and Audrey Jonas for preparations of phages Orion and Rosebush; Rémi Zallot for SSN tutorials and Gabriella Phillips for pCH111 plasmid construction; and Marie-Laurence Lemay for her help with the genome editing of Escherichia phage CAjan. We are grateful to Marie-Agnès Petit for critical reading of the manuscript and Christine Szymanski for sharing information on Campylobacter phages. S.M. holds the Tier 1 Canada Research Chair in Bacteriophages.
Author contributions
G.H., W.K., L.C., R.H., S.B., S.G., R.N., A.B.C., C.F.L., M.S., Y.J.L., and P.W. performed the experimental work. G.H., W.K., and L.C. contributed to the manuscript preparation. D.T., D.J.-S., S.M., G.F.H., P.C.D., L.H.H., and V.d.C.-L. contributed their expertise and supervision to the work. G.H. and V.d.C.-L. conceived the idea and supervised the entire project. G.H., W.K., L.C., M.S., P.W., and V.d.C.-L. wrote the manuscript.
Data availability
Data supporting the findings of this work are available within the paper and its Supplementary Information files. A reporting summary for this article is available as a Supplementary Information file. The datasets generated and analyzed during the current study are available in Supplementary Information or from the corresponding author upon request. The source data are provided as a Source Data file.
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks Lawrence Sowers, Shuang-yong Xu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Geoffrey Hutinet, Email: ghutinet@ufl.edu.
Valérie de Crécy-Lagard, Email: vcrecy@ufl.edu.
Supplementary information
Supplementary information is available for this paper at 10.1038/s41467-019-13384-y.
References
- 1. Chopin MC, Chopin A, Bidnenko E. Phage abortive infection in lactococci: variations on a theme. Curr. Opin. Microbiol. 2005;8:473–479. doi: 10.1016/j.mib.2005.06.006.
- 2. Labrie SJ, Samson JE, Moineau S. Bacteriophage resistance mechanisms. Nat. Rev. Microbiol. 2010;8:317–327. doi: 10.1038/nrmicro2315.
- 3. Golais F, Hollý J, Vítkovská J. Coevolution of bacteria and their viruses. Folia Microbiol. (Praha) 2013;58:177–186. doi: 10.1007/s12223-012-0195-5.
- 4. Makarova KS, et al. An updated evolutionary classification of CRISPR-Cas systems. Nat. Rev. Microbiol. 2015;13:722–736. doi: 10.1038/nrmicro3569.
- 5. Ershova AS, Rusinov IS, Spirin SA, Karyagina AS, Alexeevski AV. Role of restriction-modification systems in prokaryotic evolution and ecology. Biochemistry (Mosc.) 2015;80:1373–1386. doi: 10.1134/S0006297915100193.
- 6. Samson JE, Magadán AH, Sabri M, Moineau S. Revenge of the phages: defeating bacterial defences. Nat. Rev. Microbiol. 2013;11:675–687. doi: 10.1038/nrmicro3096.
- 7. Borges AL, Davidson AR, Bondy-Denomy J. The discovery, mechanisms, and evolutionary impact of anti-CRISPRs. Annu. Rev. Virol. 2017;4:37–59. doi: 10.1146/annurev-virology-101416-041616.
- 8. Pawluk A, Davidson AR, Maxwell KL. Anti-CRISPR: discovery, mechanism and function. Nat. Rev. Microbiol. 2018;16:12–17. doi: 10.1038/nrmicro.2017.120.
- 9. Bryson AL, et al. Covalent modification of bacteriophage T4 DNA inhibits CRISPR-Cas9. MBio. 2015;6:e00648. doi: 10.1128/mBio.00648-15.
- 10. Flodman K, et al. Type II restriction of bacteriophage DNA with 5hmdU-derived base modifications. Front. Microbiol. 2019;10:1–13. doi: 10.3389/fmicb.2019.00584.
- 11. Thiaville JJ, et al. Novel genomic island modifies DNA with 7-deazaguanine derivatives. Proc. Natl Acad. Sci. USA. 2016;113:E1452–E1459. doi: 10.1073/pnas.1518570113.
- 12. Reader JS, Metzgar D, Schimmel P, De Crécy-Lagard V. Identification of four genes necessary for biosynthesis of the modified nucleoside queuosine. J. Biol. Chem. 2004;279:6280–6285. doi: 10.1074/jbc.M310858200.
- 13. Phillips G, et al. Biosynthesis of 7-deazaguanosine-modified tRNA nucleosides: a new role for GTP cyclohydrolase I. J. Bacteriol. 2008;190:7876–7884. doi: 10.1128/JB.00874-08.
- 14. McCarty RM, Bandarian V. Biosynthesis of pyrrolopyrimidines. Bioorg. Chem. 2012;43:15–25. doi: 10.1016/j.bioorg.2012.01.001.
- 15. Phillips G, et al. Discovery and characterization of an amidinotransferase involved in the modification of archaeal tRNA. J. Biol. Chem. 2010;285:12706–12713. doi: 10.1074/jbc.M110.102236.
- 16. Phillips G, et al. Diversity of archaeosine synthesis in Crenarchaeota. ACS Chem. Biol. 2012;7:300–305. doi: 10.1021/cb200361w.
- 17. Bon Ramos A, Bao L, Turner B, de Crécy-Lagard V, Iwata-Reuyl D. QueF-like, a non-homologous Archaeosine synthase from the Crenarchaeota. Biomolecules. 2017;7:1–14. doi: 10.3390/biom7020036.
- 18. Van Lanen SG, et al. From cyclohydrolase to oxidoreductase: discovery of nitrile reductase activity in a common fold. Proc. Natl Acad. Sci. USA. 2005;102:4264–4269. doi: 10.1073/pnas.0408056102.
- 19. Stengl B, Reuter K, Klebe G. Mechanism and substrate specificity of tRNA-guanine transglycosylases (TGTs): tRNA-modifying enzymes from the three different kingdoms of life share a common catalytic mechanism. ChemBioChem. 2005;6:1926–1939. doi: 10.1002/cbic.200500063.
- 20. Van Lanen SG, Iwata-Reuyl D. Kinetic mechanism of the tRNA-modifying enzyme S-adenosylmethionine:tRNA ribosyltransferase-isomerase (QueA). Biochemistry. 2003;42:5312–5320. doi: 10.1021/bi034197u.
- 21. Miles ZD, McCarty RM, Molnar G, Bandarian V. Discovery of epoxyqueuosine (oQ) reductase reveals parallels between halorespiration and tRNA modification. Proc. Natl Acad. Sci. USA. 2011;108:7368–7372. doi: 10.1073/pnas.1018636108.
- 22. Zallot R, et al. Identification of a novel epoxyqueuosine reductase family by comparative genomics. ACS Chem. Biol. 2017;12:844–851. doi: 10.1021/acschembio.6b01100.
- 23. Carstens AB, Kot W, Hansen LH. Complete genome sequences of four novel Escherichia coli bacteriophages belonging to new phage groups. Genome Announc. 2015;3:e00741–15. doi: 10.1128/genomeA.00741-15.
- 24. Sabri M, et al. Genome annotation and intraviral interactome for the Streptococcus pneumoniae virulent phage Dp-1. J. Bacteriol. 2011;193:551–562. doi: 10.1128/JB.01117-10.
- 25. Kot W, et al. Complete genome sequence of Streptococcus pneumoniae virulent phage MS1. Genome Announc. 2017;5:9–10. doi: 10.1128/genomeA.00333-17.
- 26. Pedulla ML, et al. Origins of highly mosaic mycobacteriophage genomes. Cell. 2003;113:171–182. doi: 10.1016/S0092-8674(03)00233-2.
- 27. Yuan Y, et al. Identification of the minimal bacterial 2′-deoxy-7-amido-7-deazaguanine synthesis machinery. Mol. Microbiol. 2018;110:469–483. doi: 10.1111/mmi.14113.
- 28. Kulikov E, et al. Genomic sequencing and biological characteristics of a novel Escherichia coli bacteriophage 9g, a putative representative of a new Siphoviridae genus. Viruses. 2014;6:5077–5092. doi: 10.3390/v6125077.
- 29. Tsai R, Corrêa IR, Xu MY, Xu SY. Restriction and modification of deoxyarchaeosine (dG+)-containing phage 9g DNA. Sci. Rep. 2017;7:1–13. doi: 10.1038/s41598-016-0028-x.
- 30. Mačková M, Boháčová S, Perlíková P, Poštová Slavětínská L, Hocek M. Polymerase synthesis and restriction enzyme cleavage of DNA containing 7-substituted 7-deazaguanine nucleobases. ChemBioChem. 2015;16:2225–2236. doi: 10.1002/cbic.201500315.
- 31. Hutinet G, Swarjo MA, de Crécy-Lagard V. Deazaguanine derivatives, examples of crosstalk between RNA and DNA modification pathways. RNA Biol. 2017;14:1175–1184. doi: 10.1080/15476286.2016.1265200.
- 32. Hanson AD, Gregory JF. Synthesis and turnover of folates in plants. Curr. Opin. Plant Biol. 2002;5:244–249. doi: 10.1016/S1369-5266(02)00249-2.
- 33. Zallot R, Yuan Y, De Crecy-Lagard V. The Escherichia coli COG1738 member YhhQ is involved in 7-cyanodeazaguanine (preQ0) transport. Biomolecules. 2017;7:1–13. doi: 10.3390/biom7010012.
- 34. Russell DA, Hatfull GF. PhagesDB: the actinobacteriophage database. Bioinformatics. 2017;33:784–786. doi: 10.1093/bioinformatics/btw711.
- 35. Tuorto F, et al. Queuosine-modified tRNAs confer nutritional control of protein translation. EMBO J. 2018;37:e99777.
- 36. Crippen CS, et al. Two subfamilies of Campylobacter jejuni bacteriophages replace genomic deoxyguanosine with alternative nucleobases. J. Virol. 2019. doi: 10.1128/JVI.01111-19.
- 37. Cicmil N, Huang RH. Crystal structure of QueC from Bacillus subtilis: an enzyme involved in preQ1 biosynthesis. Proteins Struct. Funct. Genet. 2008;72:1084–1088. doi: 10.1002/prot.22098.
- 38. Gerlt JA, et al. Enzyme function initiative-enzyme similarity tool (EFI-EST): a web tool for generating protein sequence similarity networks. Biochim. Biophys. Acta. 2015;1854:1019–1037. doi: 10.1016/j.bbapap.2015.04.015.
- 39. O’Leary NA, et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 2016;44:D733–D745. doi: 10.1093/nar/gkv1189.
- 40. Poelen JH, Simons JD, Mungall CJ. Global biotic interactions: an open infrastructure to share and analyze species-interaction datasets. Ecol. Informatics. 2014;24:148–159. doi: 10.1016/j.ecoinf.2014.08.005.
- 41. Mihara T, et al. Linking virus genomes with host taxonomy. Viruses. 2016;8:10–15. doi: 10.3390/v8030066.
- 42. Overbeek R, et al. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res. 2005;33:5691–5702. doi: 10.1093/nar/gki866.
- 43. Carstens AB, Kot W, Lametsch R, Neve H, Hansen LH. Characterisation of a novel enterobacteria phage, CAjan, isolated from rat faeces. Arch. Virol. 2016;161:2219–2226. doi: 10.1007/s00705-016-2901-0.
- 44. Lemay M-L, Renaud A, Rousseau G, Moineau S. Targeted genome editing of virulent phages using CRISPR-Cas9. Bio-protocol. 2018;7:1–19. doi: 10.21769/BioProtoc.2674.
- 45. Vourvahis M, et al. Excretion and metabolism of lersivirine (5-{[3,5-diethyl-1-(2-hydroxyethyl)(3,5-14C2)-1H-pyrazol-4-yl]oxy}benzene-1,3-dicarbonitrile), a next-generation non-nucleoside reverse transcriptase inhibitor, after administration of [14C]lersivirine to healthy volunteers. Drug Metab. Dispos. 2010;38:789–800. doi: 10.1124/dmd.109.031252.
- 46. Mei X, et al. Crystal structure of the archaeosine synthase QueF-like–insights into amidino transfer and tRNA recognition by the tunnel fold. Proteins. 2016;165:255–269. doi: 10.1002/prot.25202.
- 47. Weigele P, Raleigh EA. Biosynthesis and function of modified bases in bacteria and their viruses. Chem. Rev. 2016;116:12655–12687.
- 48. Lee YJ, et al. Identification and biosynthesis of thymidine hypermodifications in the genomic DNA of widespread bacterial viruses. Proc. Natl Acad. Sci. USA. 2018;115:E3116–E3125. doi: 10.1073/pnas.1714812115.
- 49. Nechaev S, Severinov K. Bacteriophage-induced modifications of host RNA polymerase. Annu. Rev. Microbiol. 2004;57:301–322. doi: 10.1146/annurev.micro.57.030502.090942.
- 50. Feklístov A, Sharon BD, Darst SA, Gross CA. Bacterial sigma factors: a historical, structural, and genomic perspective. Annu. Rev. Microbiol. 2014;68:357–376. doi: 10.1146/annurev-micro-092412-155737.
- 51. Yang H, et al. Transcription regulation mechanisms of bacteriophages. Bioengineered. 2014;5:300–304. doi: 10.4161/bioe.32110.
- 52. Gleditzsch D, et al. PAM identification by CRISPR-Cas effector complexes: diversified mechanisms and structures. RNA Biol. 2019;16:504–517. doi: 10.1080/15476286.2018.1504546.
- 53. Altschul SF, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.
- 54. Lopes A, Amarir-Bouhram J, Faure G, Petit MA, Guerois R. Detection of novel recombinases in bacteriophage genomes unveils Rad52, Rad51 and Gp2.5 remote homologs. Nucleic Acids Res. 2010;38:3952–3962. doi: 10.1093/nar/gkq096.
- 55. Söding J. Protein homology detection by HMM-HMM comparison. Bioinformatics. 2005;21:951–960. doi: 10.1093/bioinformatics/bti125.
- 56. Altenhoff AM, et al. The OMA orthology database in 2018: retrieving evolutionary relationships among all domains of life through richer web and programmatic interfaces. Nucleic Acids Res. 2018;46:D477–D485. doi: 10.1093/nar/gkx1019.
- 57. Shannon P, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–2504. doi: 10.1101/gr.1239303.
- 58. Lemay ML, Tremblay DM, Moineau S. Genome engineering of virulent lactococcal phages using CRISPR-Cas9. ACS Synth. Biol. 2017;6:1351–1358. doi: 10.1021/acssynbio.6b00388.
- 59. Kot W, Vogensen FK, Sørensen SJ, Hansen LH. DPS—a rapid method for genome sequencing of DNA-containing bacteriophages directly from a single plaque. J. Virol. Methods. 2014;196:152–156. doi: 10.1016/j.jviromet.2013.10.040.