Applied and Environmental Microbiology. 2022 Mar 14;88(7):e00060-22. doi: 10.1128/aem.00060-22

Mosaic Evolution of Beta-Barrel-Porin-Encoding Genes in Escherichia coli

Xiongbin Chen a,#, Xuxia Cai a,#, Zewei Chen a,#, Jinjin Wu a, Gaofeng Hao b, Quan Luo b, Shuhong Liu a, Junya Zhang a, Yueming Hu a, Guoqiang Zhu c, Wolfgang Koester d, Aaron P White d, Yi Cai e, Yejun Wang a
Editor: Robert M Kelly f
PMCID: PMC9004372  PMID: 35285711

ABSTRACT

Bacterial porin-encoding genes are often found under positive selection. Local recombination has also been identified in a few of them to facilitate rapid bacterial adaptation, although it remains unknown whether it is a common evolutionary mechanism for the porins or outer membrane proteins in Gram-negative bacteria. In this study, we investigated the beta-barrel (β-barrel) porin-encoding genes in Escherichia coli that were reported under positive Darwinian selection. Besides fhuA, which was previously found with ingenic local recombination, we identified four other genes, i.e., lamB, ompA, ompC, and ompF, all showing similar mosaic evolution patterns. Comparative analysis of the protein sequences disclosed a list of highly variable regions in each family, which are mostly located in the convex parts of extracellular loops and coincide with the binding sites of bacteriophages. For each of the porin families, mosaic recombination leads to unique combinations of the variable regions with different sequence patterns, generating diverse protein groups. Structural modeling indicated a conserved global topology among the different porins, with the extracellular surface varying considerably due to individual or combinatorial variable regions. The conservation of global tertiary structure would ensure the channel activity, while the wide diversity of variable regions may represent selection to avoid the invasion of phages, antibiotics or immune surveillance factors. Our study identified multiple bacterial porin genes with mosaic evolution. We hypothesize that this could be a generalized strategy for outer membrane proteins to both maintain normal life processes and evade the attack of unfavored factors rapidly.

IMPORTANCE Microevolution studies can disclose more elaborate evolutionary mechanisms of genes, appearing especially important for genes with multifaceted function such as those encoding outer membrane proteins. However, in most cases, the gene is considered as a whole unit, and the evolutionary patterns are disclosed. Here, we report that multiple bacterial porin proteins follow mosaic evolution, with local ingenic recombination combined with spontaneous mutations based on positive Darwinian selection, and conservation for most structural regions. This could represent a common mechanism for bacterial outer membrane proteins. The variable regions within each porin family showed large coincidence with the binding sites of bacteriophages, antibiotics, and immune factors and therefore would represent effective targets for the development of new antibacterial agents or vaccines.

KEYWORDS: mosaic evolution, local recombination, β-barrel porin, FhuA, Escherichia coli, LamB, OmpA, OmpC, OmpF

INTRODUCTION

Homologous recombination not only happens in “non-core” accessory genes that are essential for the population of a bacterial lineage but also frequently in highly conserved “core” genes (1–3). It can facilitate the adaptive evolution processes of bacterial organisms, as has been demonstrated by the observation of paralleled increase of adaptive evolution and the rates of homologous recombination (4–6). Annotation and functional enrichment analysis on the genes predicted to have experienced recombination suggest that outer membrane transporter encoding genes represent one of the major essential gene families with frequent recombination (3, 7). However, for most bacterial species, it is yet unknown how often and for which genes recombination happens and how recombination influences the evolution of the microorganisms.

Recently, one of the outer membrane transporter genes, fhuA, was reported with local recombination in Escherichia coli (8). The gene was also reported to be under positive Darwinian selection in E. coli previously, a process mediated by spontaneous mutations (9, 10). The main recombination fragments coincide with the sites under positive selection, suggesting parallel positive selection within the recombination regions, i.e., mosaic evolution patterns (8). FhuA is one of the β-barrel outer membrane proteins (OMPs), which provide multiple functions in Gram-negative bacteria, for instance, the transport of various substrates and signal transduction, facilitating bacterial adaptation to different environments (11). Typically, the β-barrel OMPs form closed barrels composed of antiparallel β-strands, with hydrophobic residues facing the bacterial outer membrane and hydrophilic residues facing the inner surface of the channels (11, 12). In addition to fhuA, multiple other β-barrel OMP encoding genes have also been found under positive Darwinian selection in Escherichia coli, such as ompA, ompC, ompF, etc. (9, 10). Similarly, local recombination was also reported in ompA and ompF in Chlamydia and Yersinia spp., respectively (13, 14). However, it remains unknown whether there are other E. coli β-barrel OMP genes that show mosaic evolution patterns.

Like FhuA, many other β-barrel OMPs reported to be under positive selection are also receptors for various substrates and have multifaceted functions, e.g., OmpA, OmpC, and OmpF. OmpA is a major heat-modifiable porin with a molecular weight of 28 to 36 kDa. The N terminus of OmpA forms an eight-stranded, antiparallel, hydrophobic β-barrel domain, while the C terminus forms a hydrophilic, globular domain located in the periplasm (15). OmpA participates in various processes of bacterial infections, such as bacterial adhesion, invasion, intracellular survival, and escape from host defenses (16, 17). OmpA is also the receptor of multiple phages and colicins (U and L) and is a good candidate for vaccine design since the outer membrane fragments could elicit effective immune responses (17–19). OmpC and OmpF are classical porin proteins, abundantly distributed in the E. coli outer membrane and essential for bacterial survival (20). They are also the main channels via which antibiotics and other virulence factors enter bacterial cells (21–23). The decreased expression of these genes could be related with bacterial resistance to the antibiotics (24). Like OmpA and FhuA, OmpC and OmpF are also the receptors for bacteriophages (25–27).

Here, we investigated the β-barrel OMP genes under positive selection reported previously, including ompA, ompC, ompF, and others, and we observed whether they showed mosaic evolution patterns with local recombination and positive selection. Furthermore, sequence analysis and structural modeling were performed on these genes to explore the underlying mechanisms and the functional relevance of the mosaic evolution.

RESULTS

Screening of Escherichia porin genes under mosaic evolution.

Besides FhuA, the protein sequences of four other β-barrel porins (LamB, OmpA, OmpC, and OmpF) also showed clustering patterns incongruent with the phylogenomic tree of E. coli lineages or other β-barrel porin proteins (Fig. 1; see also Fig. S1 in the supplemental material). For instance, E. coli MG1655, P12b, and UMNK88 are three evolutionarily close strains of lineage A, as was confirmed by the E. coli core-proteome tree (Fig. 1). However, the LamB protein sequences from these three strains were located in different phylogenetic clades, while OmpA and OmpC from MG1655/P12b and UMNK88 were also located in different sequence clades (Fig. 1). Meanwhile, sequences of strains from different lineages clustered together, e.g., LamB proteins of P12b from lineage A, 11368 from lineage B1, and UMN026 from lineage D1 (Fig. 1). In contrast, for PurR, another representative β-barrel porin protein, the sequences from the same E. coli lineage clustered together, and the sequences generally followed the phylogenetic routes of lineages or core genomes (see Fig. S1).

FIG 1.


Phylogenetic clustering of E. coli porin proteins. NJ unrooted trees were built for LamB, OmpA, OmpC, and OmpF. For each protein family or the core proteome, the robustness of the phylogenetic tree was examined by bootstrapping tests with 1,000 replicates, and the scores are indicated for nodes of no lower than 50%. The gene locus tag for each protein and the strain name were indicated, and strains from the same E. coli (or E. fergusonii) lineage are labeled with a unique colored sign.

The within-lineage amino acid substitution rates of LamB, OmpA, OmpC, and OmpF were compared to interlineage substitution rates of the corresponding proteins, respectively (Fig. 2). The comparison was also performed for the core proteome of E. coli, which served as a control. For the control, all the within-lineage rates except D1 were significantly smaller than the interlineage rate (Fig. 2; Mann-Whitney U tests with Bonferroni correction, P < 0.05). There were only two representative strains included for lineage D1, which could have led to the lack of significance. For the four porin proteins, the significance was generally lowered, with more than one lineage showing nonsignificant difference between intra- and interlineage phylogenetic distance measured by the substitution rates (Fig. 2; P > 0.05). The results suggested that the incongruent clustering patterns of the four porin proteins were more likely caused by gene recombination rather than lineage-associated spontaneous mutations. Like fhuA, the genomic loci for these genes were conserved, showing large collinearity and synteny among different strains of E. coli, suggesting that there were likely within-gene microbial recombination events rather than large-scale horizontal gene transfers that caused the incongruent clustering patterns observed for the genes (see Fig. S2).

FIG 2.


Genetic exchange testing of porin proteins or the core proteome among E. coli lineages. The phylogenetic distance among E. coli strains within each lineage is shown and compared to that among strains from different lineages. Mann-Whitney U tests were performed, and P values larger than 0.05 are highlighted in red. A-A, B1-B1, B2-B2, D1-D1, D2-D2, or E-E represent strain pairs from the same lineage: A, B1, B2, D1, D2, or E, respectively. X-Y represents pairs of strains from different lineages, e.g., A-B1, A-B2, D1-D2, etc. The lineage with only two representative strains is indicated with an asterisk.

To further confirm that the genes have undergone local recombination, the DNA sequences were retrieved and recombination events were detected with different algorithms implemented in RDP4 (61). We identified significant recombination events among the Escherichia and Shigella strains for all four genes (Fig. 3A). Although the number of recombination events identified for some genes varied widely between algorithms, the genes did show highly frequent local recombination, and multiple segments were identified with significant recombination events by multiple algorithms simultaneously (Fig. 3A). For representative segments within which recombination events potentially happened, codon-position-specific selection pressure was analyzed, among all strains or among those with relatively homogeneous sequences, i.e., those without local recombination. In every case, codons were detected under positive selection with dN/dS ratios significantly higher than 1 (Fig. 3B). Previous studies within small homogeneous E. coli populations (and therefore without recombination detected) also suggested local sites under positive selection (9, 10). Taken together, the results suggested that there was mosaic evolution within these porin genes, that is, parallel local recombination and positive selection within recombination fragments.

FIG 3.


Detection of local recombination events at DNA level and their positive selection. (A) Detection of local recombination within porin genes with RDP4. The significant recombination events and their predicted origins are shown in a schematic sequence display. The recombination fragments predicted by different algorithms integrated in RDP4 are shown in different colors, as indicated at the bottom. Each homologous recombination event could be predicted by different algorithms but is only shown for the results of one representative algorithm. For each of the porin genes, representative fragments are shown with the statistical P values for the corresponding algorithms predicting them to be recombination events. (B) Sites under positive selection within recombined fragments. Examples are shown for the porin genes with sites under positive selection within recombined fragments. The local observed fragments and the sites are indicated. The omega value (dN/dS) was estimated using PAML for each codon site with M8 model among all strains or the strains with the unique recombination pattern. The sites under significant positive selection (omega>1) are shown with asterisks in red (significant for both M2a and M8 models) or in blue (significant for only M8 model). The codon number is highlighted in red if the site was significantly under positive selection for both the tests among all strains and those among strains with a unique recombination pattern.

Sequence patterns of local recombination hot spots within E. coli porin proteins.

For each porin family, multiple sequence alignments demonstrated that the protein sequences did not vary evenly along the whole length. The variation hot spots often coincided with the recombination blocks predicted by the RDP4 algorithms. Therefore, an alignment-based calling strategy was applied to find the local highly variable regions (HVRs) in each porin family (see Materials and Methods).
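The calling criteria actually used are those described in Materials and Methods and are not reproduced here. Purely as an illustration of the general idea, the sketch below scans an alignment for stretches of variable columns using per-column Shannon entropy. The entropy measure, the function names, and the thresholds (`min_entropy`, `max_gap`, `min_len`) are all assumptions chosen for the example, not the parameters of this study.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def variable_regions(seqs, min_entropy=0.5, max_gap=2, min_len=1):
    """Return 1-based inclusive (start, end) ranges of variable alignment columns.

    Columns with entropy >= min_entropy are treated as variable; variable columns
    separated by at most max_gap conserved columns are merged into one region.
    All three thresholds are hypothetical illustration values.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    variable = [i for i in range(length)
                if column_entropy([s[i] for s in seqs]) >= min_entropy]
    regions = []
    for i in variable:
        if regions and i - regions[-1][1] <= max_gap + 1:
            regions[-1][1] = i          # extend the current region
        else:
            regions.append([i, i])      # start a new region
    return [(s + 1, e + 1) for s, e in regions if e - s + 1 >= min_len]

# Toy alignment (not real porin sequences): only columns 5 and 6 vary.
toy = ["AAAAKLAAAA", "AAAARTAAAA", "AAAAKSAAAA", "AAAAQLAAAA"]
print(variable_regions(toy))  # [(5, 6)]
```

Regions called this way would still need to be compared against the recombination blocks reported by RDP4 before being treated as HVRs.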

Multiple sequence alignment showed that most positions of the LamB sequences were conserved for amino acid composition except for a 19-amino-acid (aa) region, P406-424 (Fig. 4A). The local HVR independently exhibited the clustering patterns exactly like that of the full-length LamB protein sequences (Fig. 4B), while the remaining regions in concatenation formed a phylogenetic tree that is generally consistent with the phylogenomic tree or cannot distinguish known E. coli lineages due to the high sequence conservation (Fig. 4C). The divergence among strains of the same lineage and cross-lineage clustering of the local variable protein sequences further suggested frequent local recombination for lamB genes among E. coli strains, which was further confirmed by the exact consistency of the HVR with the most frequent recombination event predicted by different algorithms (Fig. 3A). According to the phylogenetic clustering results and sequence patterns of the P406-424 region, LamB was classified into four groups (LamB1 to LamB4) (Fig. 4A and B). Different groups showed apparent sequence diversity. Within each group, mutations also happened frequently, which might have caused amino acid changes and generated novel, positively selected patterns that broadened the space of local recombination (e.g., LamB2-1 and LamB2-2; see Fig. 4A and Fig. 3B).

FIG 4.


Sequence diversity of highly variable regions within E. coli porin proteins. The sequence patterns of highly variable regions (HVRs) of LamB, OmpA, OmpC, and OmpF are shown in panels A, D, G, and H, respectively. For OmpC and OmpF, the different patterns for each HVR are shown in different background colors. The NJ tree of the LamB HVR fragment and that of concatenated non-HVRs were shown in panels B and C, respectively. The OmpA tree based on the concatenated HVRs is shown in panel E. The trees are unrooted, and bootstrapping tests were performed with 1,000 replicates. Strains from the same lineage were labeled with a unique colored sign. The combination of HVR patterns for each OmpA phylogenetic group is shown in panel F.

Similar to LamB, the OmpA, OmpC, and OmpF proteins also showed the mosaic sequence pattern of conservation for major positions and diversity for local regions (Fig. 4D to H). OmpA, OmpC, and OmpF all had multiple HVRs (Fig. 4D, G, and H). Similarly, the concatenated HVR sequences of OmpA (or OmpC/F) generated a clustering tree like that derived from full-length protein sequences (Fig. 4E). Different HVRs of OmpA, OmpC, and OmpF showed varied sequence patterns and combined into different patterns, i.e., 3, 15, and 5 major groups for OmpA, OmpC, and OmpF, respectively (Fig. 4F to H). At the gene level, most of the HVR-encoding nucleotide sequences were also predicted as local recombination fragments at least by one of the algorithms included in RDP4. A single residue within OmpA (P46) was also listed as an HVR and considered an independent recombination site, since the position varied a lot among a limited set of amino acid species and showed apparent linkage with different sequential patterns of P87-89 and P129-135 (Fig. 4F). HVRs within each group also mutated, and subgroups could be formed by the positively selected mutations (Fig. 3B) or shorter-fragmental recombination events.

HVRs are located within the outer membrane convex loops.

Tertiary structure of the porin proteins of E. coli MG1655 was predicted by different methods. LamB, OmpC, and OmpF folded into the classical conformation of β-barrel outer membrane spanning proteins, while OmpA showed two separated domains connected by a long flexible loop and one of the domains formed the structure of a β-barrel (Fig. 5A). As observed in FhuA, most HVRs of E. coli LamB, OmpA, OmpC, and OmpF were located in the extracellular interfaces of the β-barrels, while the rest of the regions, i.e., transmembrane domains and cytoplasmic interfaces, were quite conserved (Fig. 5A).

FIG 5.


Structural locations of HVRs in E. coli porin proteins. (A) Tertiary structures of MG1655 LamB, OmpA, OmpC, and OmpF. The extracellular loops predicted by TMHMM are highlighted in different colors. HVRs are shown in spheres. (B) The transmembrane topology represents protein sequences of LamB, OmpA, OmpC, and OmpF. The topology was predicted with TMHMM. Each extracellular loop is indicated with “L” and the order number. The HVRs are highlighted in red.

Specifically, the HVRs were located within the ninth extracellular loop (L9) for LamB, L1/L2/L3 for OmpA, L1/L2/L4/L5/L7 for OmpC, and L1/L3/L4/L5 for OmpF, respectively (Fig. 5B). Nearly all of the HVRs were located at the apex of the convex loops (Fig. 5A). As exceptions, two HVRs, i.e., HVR5 of OmpC and HVR2 of OmpF, were located within the loops within the cytoplasmic interfaces (Fig. 5A and B).

Structural and functional relevance of the HVRs.

The E. coli porin proteins of different groups were modeled for the tertiary structure. For LamB, OmpC, or OmpF, different HVR groups showed largely homologous global topology (see Fig. S3). The two big domains of OmpA formed by the N-terminal 195-aa and C-terminal 140-aa protein sequences, respectively, did not show apparent overall difference among the HVR groups either, and yet the physical distance and relative location between the two major domains varied, which were determined by the angle formed by the long flexible loop connecting the domains (see Fig. S3).

While not affecting the overall protein structure, the extracellular surfaces formed by the HVRs showed apparent differences for each of the porin families (Fig. 6; see also Fig. 9). We hypothesize that the structural variations were related to other aspects of protein functions, perhaps for avoiding bacteriophage binding.

FIG 6.


Tertiary structures of E. coli LamB protein groups. The subregions of HVR1 and their sequences are indicated and highlighted in different colors. The local structures of the subregions of HVR1 are shown as colored spheres. The diagram at bottom shows the regional function of LamB.

FIG 9.


Tertiary structure of E. coli OmpF protein groups. The extracellular HVRs and their sequences are indicated and highlighted in different colors. The local structures of the HVRs are shown as colored spheres. The known binding sites of bacteriophages are indicated in the diagram.

In LamB, there was only one HVR (HVR1), comprising 19 aa, which was subsequently divided into four sequential subregions, HVR1_1 through HVR1_4 (Fig. 6). HVR1_1 showed the most striking sequence divergence, and HVR1_3 was the most conserved among all the LamB proteins. It was noted that while the overall interface formed by HVR1 varied, the region of HVR1_1 changed most extensively (Fig. 6). LamB is a malto-oligosaccharide-selective pore protein that binds and transports maltose and related bacterial energy source substances. It is also a receptor for the entry of various bacteriophages (28–31). Based on previous reports, the maltose-binding sites are primarily located in the N-terminal ∼1/3 of the protein, while key sites required for bacteriophage entry are located in the C-terminal part of the protein (29, 30). HVR1 is located in L9 of the C-terminal part of the protein (Fig. 6), where critical bacteriophage binding residues were reported to be located (29, 30). Consistent with this, the sequence and structural diversity within HVR1 could potentially be the result of bacteria adapting to avoid the binding and invasion of bacteriophages. No other major sequence changes were detected throughout LamB, including the whole β-barrel structure, preserving normal protein function for transport of maltose and other nutrients required for bacterial survival.

OmpA is the known receptor of Sf6 and related bacteriophages, and the L2 loop is critical for Sf6 binding (26, 32). The two residues “NI” in the Shigella flexneri HVR2 were confirmed to influence the efficiency of Sf6 entry (32). HVRs 1, 2, and 3 of OmpA are located in L1, L2, and L3, respectively (Fig. 7), and form an interface with HVR2 in the center flanked by HVR1 and HVR3, which differs strikingly among the three groups of E. coli OmpA (Fig. 7). The sequences of HVRs 1 to 3 of E. coli UMNK88 OmpA are identical to those of S. flexneri (Fig. 2B). The HVR2 region appeared to protrude in UMNK88 but was flat and level with HVR1 and HVR3 in Shigella dysenteriae Sd197, where residues “DNI” in UMNK88 were replaced with “SVE” (Fig. 7). In E. coli MG1655, significant mutations in HVR1 and HVR3 appeared to cause the HVR2 region to protrude more strikingly (Fig. 7). Again, we hypothesize that these variations could be associated with or arise because of the interactions with bacteriophages.

FIG 7.


Tertiary structure of E. coli OmpA protein groups. The HVRs and their sequences are indicated and highlighted in different colors. The local structures of the HVRs are shown as colored spheres. The known binding sites of bacteriophages are indicated in the diagram.

The loops L1, L4, and L5 in OmpC, corresponding to HVR1, HVR3/HVR4, and HVR6, respectively (Fig. 5), were essential for attachment of phage T4 and possibly also for GH-K3 and S16 (33–35). L4 (HVR3 and HVR4) is also an important binding site for other OmpC-specific phages, e.g., Tu1b, SS4, and Hy2 (36). In addition to HVR1, HVR3, HVR4, and HVR6, there are other extracellular regions, i.e., HVR2 located in L2 and HVR7 in L7 (Fig. 5), all of which showed great sequence diversity (Fig. 4). Mutations in individual HVRs independently caused microstructure variations, and the combination of diverse sequence patterns in each HVR led to wide and extensive changes of the OmpC extracellular interface (Fig. 8).

FIG 8.


Tertiary structure of E. coli OmpC protein groups. The sequences of extracellular HVRs are highlighted in different colors. The local structures of the HVRs are shown as colored spheres.

OmpF plays a central role during the colicin uptake by sensitive E. coli cells. It is also an important bacteriophage receptor, mainly responsible for the transport of T2-like and other phages. Mutations in loops L5 (HVR4) and L6 (HVR5) have previously been shown to block the binding of phage K20 to E. coli K-12 without altering the measured OmpF channel activity (25). Structural modeling showed that HVR5, and to a lesser extent HVR4, led to significant surface changes among different OmpF groups (Fig. 9). The domain volume and interface formed by OmpF HVR5 varied a lot since there was frequent gain or loss involving multiple residues in the regions, and additional substitutions (Fig. 9). In OmpF proteins of E. coli MG1655 and SE15, HVR4 and HVR5 were contiguous, while these same HVRs were separated with a cleft in E. coli ATCC 8739 and separated even further away in E. coli E234869 and CE10 (Fig. 9). The conformation changes of both individual HVRs and the whole interface could be associated with the binding and entry efficiency of bacteriophages.

Taken together, the results demonstrated that each of the four porin protein families had structural changes in the HVR regions without altering the overall channel topology. Most of the HVRs were in regions known to contain critical recognition sites for various bacteriophages, and therefore the diversity in HVRs could potentially alter the resistance of strains to bacteriophage attachment and invasion. It is assumed that the conservation of global structure, cytoplasmic interfaces, and other extracellular surfaces preserves the biological function of each channel, ensuring transporter and binding activity for substances useful for bacterial survival.

DISCUSSION

Previously, we reported that fhuA, which encodes a siderophore transporter, had an extraordinary evolutionary pattern, with a combination of negative selection, positive selection, and especially local recombination simultaneously (8). This kind of within-molecule recombination event is difficult to detect with established tools, and detection is also influenced by the size and variation of the strain population studied; therefore, fhuA and other genes were reported to be under positive selection, but no recombination was identified (9, 10). In the present study, we broadened our analysis to other β-barrel porin proteins that were also reported to be under positive selection (9, 10). We identified at least four additional porin-encoding genes (i.e., lamB, ompA, ompC, and ompF) that followed evolutionary routes similar to that for fhuA. All of the genes were identified with local recombination, which was confirmed by multiple approaches in phylogeny analysis and genetic exchange tests at both protein and DNA levels (Fig. 1 and 3). Like fhuA, all four porin genes also showed mosaic patterns with both recombination and positive selection (Fig. 3B) (9, 10). We identified several additional proteins that may be similar but were excluded since their local-recombination patterns were not as typical as those of FhuA or the other four porin proteins. Alternative to the recombination hypothesis, convergent mutations could also have happened in these genes among E. coli lineages, and structure constraints could have facilitated the processes (37). However, the multiple “convergence” patterns within a family of proteins among the strains from different lineages made local sequence recombination the more likely explanation. Such local recombination patterns have also been identified within multiple porins in other species, e.g., OprD in Pseudomonas (38), PorA and PorB in Neisseria (39–41), OmpL1 in Leptospira (42), OmpA in Chlamydia and Wolbachia (13, 43), and OmpF in Yersinia (14). Local genetic exchange events were also identified in bacterial adhesins and other outer membrane β-barrel proteins (44–46). Positive selection driven by point mutations was also identified within these local recombination blocks. The mosaic evolution could represent a common route undergone by bacterial outer membrane proteins.

Besides the mosaic patterns of simultaneous local recombination and positive selection for the single regions, the proteins analyzed in this study or elsewhere also showed a mosaic pattern at the molecular level, that is, a combination of different fragments showing different evolution processes, e.g., recombination, positive selection, and/or purifying selection. The proteins analyzed in our study or described elsewhere as having mosaic evolution are mainly multifunctional outer membrane proteins. Typically, the proteins have important functions, for example, the import of nutrient or energy source substances essential for bacterial survival (13, 28, 47). Phages, virulence factors (such as colicins and microcins deployed by competitors), antibiotics, or antibodies can also hijack these proteins to enter and kill the recipient bacteria (26, 31, 48). For optimum bacterial fitness, these proteins must balance escape from recognition by the harmful factors against retaining transport activity and the capability of being recognized and bound by the essential substances. Tracing the evolutionary patterns within these proteins, the fragments involved in transport activity and the binding sites of substances essential for bacterial survival should be conserved, while the recognition sites of disfavored factors should be under positive selection and mutate extensively (8, 14, 29). Consistent with the hypothesis, all the porin proteins analyzed in this study showed globally conserved sequences and structures interspersed with small HVRs with extensive sequential and structural diversity. Within LamB, OmpA, OmpC, and OmpF, nearly all the reported critical phage-binding sites coincided with the identified, mosaic HVRs, and the known recognition sites of essential substrates or regions involved in transport activity were seldom located in the HVRs or showed structural variations. This type of mosaic evolution has also been identified in multicellular eukaryotic organisms, although different evolutionary mechanisms could be involved. For example, there are similar multifunctional receptor proteins usurped by fatal viruses in humans or other animals, e.g., ACE2 bound by SARS-CoV-2-related viruses, the coreceptor of HIV-1 and Yersinia pestis CCR5, etc., and local diversities with functional relevance have been identified in these proteins (49, 50).

Homologous recombination is common within bacteria in natural environment and happens frequently (51, 52). This could increase the fixation probability of adaptive mutations and decrease the probability of fixation of deleterious mutations and therefore facilitate adaptive evolution (53). The positive correlation between recombination and adaptation has been demonstrated in multiple bacterial species, such as the rhizobial soil bacteria (6) and other species (4, 5, 54, 55). When a bacterial population encounters the attack of phages or other harmful substances, both genetic recombination and spontaneous mutations could happen that help the population survive from the disfavored environment. Recombination could benefit the bacteria more efficiently since the potential recombined polymorphic gene fragments from other bacteria have experienced a long selective and fitting process. If homologous recombination in individual bacteria involving the gene or local fragments happens, such strains could probably have the fitting advantages over the positively selected ones mediated by random mutations (56). However, mutations and their mediated positive selection continue for better adaptation of the bacterial population living in the environment. Meanwhile, the spontaneous mutations create new polymorphic forms of gene sequences, further enlarging the pool for recombination.

We predicted the structure for the different groups of porin proteins using state-of-the-art methods, including a classical homology modeling-based method and another ab initio one proposed most recently that was reported with best performance at present (57, 58). Except for the angles between the two major domains in OmpA formed by the long flexible coil, the global structure for each porin family was conserved and consistent between the prediction results of different prediction methods. Domain and local structure of the proteins predicted by the two methods were generally consistent and therefore reliable. Similar to the known sites critical for the entry of known phages, we identified here more and longer HVRs in most of the porin proteins, which also showed extensive sequence and local structure changes. Meanwhile, there is likely a phage repertoire larger than what is currently known that takes the porin proteins as receptors for entry into host cells. Other unknown toxins, antibiotics, or immune surveillance could also target these porins. Therefore, the porins under mosaic evolution and the HVRs identified in the study could facilitate studies on new phage-bacterium coevolutionary mechanisms, the identification of their possible critical binding sites, and the development of new antibacterial drugs or treatment regimens.

MATERIALS AND METHODS

Bacterial strains, genomes, and protein sequences.

The genes reported to be under positive selection in E. coli were compiled from previous studies and merged (9, 10), and the corresponding protein sequences from the model strain MG1655 were downloaded from the NCBI protein database. The accessions for the genes, proteins, and strain genomes used in this study are listed in Data Set S1. Representative strains of the major E. coli lineages, together with their lineage information, genome sequences, genome annotation files, genome-derived proteome sequences, core genomes, and core genome-derived proteomes, were retrieved from our previous study (8). A representative strain of E. fergusonii, ATCC 35469, was used as an outgroup or control, and its genome and proteome information was also downloaded. As before, Shigella species were considered phylogenetic subbranches of E. coli (59).

Phylogenetic analysis.

Amino acid sequences were aligned with ClustalW (https://myhits.sib.swiss/cgi-bin/clustalw) using the default parameters. Phylogenetic trees were built with both the neighbor-joining (NJ) and maximum likelihood (ML) methods implemented in MEGA 6.0 (60). Only the NJ trees are shown in this study because the NJ and ML trees were consistent with each other for all of the studied proteins. The Jones-Taylor-Thornton model of amino acid substitution was used for ML tree building. Bootstrap tests with 1,000 replicates were performed to assess the robustness of each node of a tree; only branches supported by ≥50% of the replicate trees were considered stable. The core proteome trees of E. coli were rebuilt according to previously described methods (8).
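The bootstrap stability criterion can be illustrated with a minimal Python sketch. It is not the MEGA implementation: it assumes each tree has already been reduced to a set of clades (each clade a frozenset of strain names), and the function names `clade_support` and `stable_clades` are hypothetical. The 50% threshold is the one stated above.

```python
def clade_support(reference_clades, replicate_clade_sets):
    """Percentage of bootstrap replicates that contain each reference clade.

    reference_clades: iterable of frozensets of taxon names (assumed input).
    replicate_clade_sets: list with one set of clades per bootstrap replicate.
    """
    n_replicates = len(replicate_clade_sets)
    if n_replicates == 0:
        raise ValueError("at least one bootstrap replicate is required")
    support = {}
    for clade in reference_clades:
        hits = sum(1 for clades in replicate_clade_sets if clade in clades)
        support[clade] = 100.0 * hits / n_replicates
    return support


def stable_clades(support, threshold=50.0):
    """Keep clades whose bootstrap support reaches the threshold (in percent)."""
    return {clade for clade, value in support.items() if value >= threshold}
```

In practice the replicate count would be 1,000, as in the text; the sketch works for any number of replicates.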

Genetic exchange testing.

To assist in screening for ingenic local genetic exchange events, a simplified hypothesis-based test was proposed. “H0” indicates that no local genetic exchange happened among strains from different lineages; “H1” indicates that local genetic exchanges happened among strains from different lineages. Suppose H0 holds. Then the phylogenetic distance among strains from the same lineage should be significantly smaller than the distance among strains from different lineages, which can be verified first. However, if the intralineage distance is larger than or not significantly different from the interlineage distance, the starting hypothesis (H0) is rejected and the alternative hypothesis (H1) is accepted. In this study, phylogenetic distance was measured as the adjusted substitution rate of amino acids. The E. coli core genome-derived proteome was used as a control to confirm that the intralineage distance is smaller than the interlineage distance. Representative E. coli strains were selected, and the phylogenetic distance for the porin proteins or the core proteome was calculated between each pair of strains from the same lineage (intralineage) or from different lineages (interlineage). Mann-Whitney U tests were performed to compare the intralineage and interlineage distances. Bonferroni corrections were applied to the P values because multiple tests were involved. For all statistical tests, the type I error threshold was preset at 0.05.
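The testing procedure can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code used in the study: the Mann-Whitney U test is implemented with a two-sided normal approximation (with tie and continuity corrections), "smaller" is judged by comparing medians, and the names `split_distances`, `mann_whitney_u`, and `h0_rejected` are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def split_distances(dist, lineage):
    """Split pairwise distances into intralineage and interlineage lists.

    dist maps frozenset({strain_a, strain_b}) to a distance;
    lineage maps each strain to its lineage label.
    """
    intra, inter = [], []
    for a, b in combinations(sorted(lineage), 2):
        d = dist[frozenset((a, b))]
        (intra if lineage[a] == lineage[b] else inter).append(d)
    return intra, inter


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided P value by normal approximation)."""
    n1, n2 = len(x), len(y)
    values = list(x) + list(y)
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0.0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):          # tied values share the average rank
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0
    z = max((abs(u1 - n1 * n2 / 2.0) - 0.5) / math.sqrt(var_u), 0.0)
    return u1, math.erfc(z / math.sqrt(2.0))


def h0_rejected(intra, inter, alpha=0.05, n_tests=1):
    """H0 (no local exchange) is kept only if the intralineage distances are
    significantly smaller than the interlineage ones after Bonferroni correction."""
    _, p = mann_whitney_u(intra, inter)
    p_adjusted = min(1.0, p * n_tests)     # Bonferroni correction
    intra_smaller = p_adjusted < alpha and median(intra) < median(inter)
    return not intra_smaller
```

For the core proteome control, `h0_rejected` would be expected to return `False`; a porin family for which it returns `True` would be a candidate for local genetic exchange under this simplified scheme.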

Detection of local recombination events and positive selection sites at DNA level.

RDP4 was used for detection and visualization of recombination events within gene sequences from a population of bacterial strains (61). The detected recombination events were illustrated as schematic sequence displays. RDP4 integrates nine different statistics-based methods, all of which were applied to detect recombination events in the porin genes (61). A recombination event was accepted only if it was significant by at least one method. The significance level was preset at 0.05. To further examine whether there were codons under positive selection within each local recombination region, the codon sequences of each recombination region were retrieved either from all the strains without insertions/deletions or from a specific group of bacteria with the same local recombination pattern and aligned in codon mode, and the omega (dN/dS) was estimated for each codon with PAML (version 4.9j) (62). The omega values were estimated, and the Naive Empirical Bayes probabilities were tested, under both the M2a and M8 models. A site was considered positively selected if omega was significantly larger than 1. The significance level was preset at 0.05.
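The acceptance rule for recombination events ("significant by at least one method") can be sketched as below. The sketch assumes the per-method P values have already been exported from RDP4 into a dictionary; the event identifiers, the method labels, and the function names are hypothetical placeholders, not RDP4 output fields.

```python
ALPHA = 0.05  # significance level stated in the text


def significant_methods(p_values, alpha=ALPHA):
    """Names of the methods whose P value is below alpha.

    p_values maps a method label to its P value, or to None when the
    method did not report the event.
    """
    return sorted(name for name, p in p_values.items()
                  if p is not None and p < alpha)


def accepted_events(events, alpha=ALPHA):
    """Keep events supported by at least one method; map each to its supporting methods."""
    kept = {}
    for event_id, p_values in events.items():
        methods = significant_methods(p_values, alpha)
        if methods:
            kept[event_id] = methods
    return kept
```

Returning the list of supporting methods, rather than a bare yes/no, makes it easy to apply a stricter rule later (for example, requiring support from several methods) without recomputing anything.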

Definition of highly variable regions.

The HVRs were defined empirically according to the multiple sequence alignments of the protein sequences for each porin family. The full-length protein sequences were aligned with ClustalW, and candidate HVRs were defined as continuous blocks with apparent mutations or insertions/deletions, or as highly variable blocks with apparent linkage of sequence patterns that were separated by small conserved blocks, normally no longer than 5 aa. The recombination events detected with RDP4 (61) were also compared, and most of the HVRs were consistent with or located within the local recombination regions detected by the methods integrated in RDP4.
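Because the HVRs were defined empirically, no algorithm is given in the text. The following Python sketch shows one way such candidates could be proposed automatically for manual inspection. It rests on assumptions that are not part of the study: a column counts as variable when it contains more than one distinct character (gaps included), variable columns separated by at most `max_gap` conserved columns are merged, and regions shorter than the hypothetical `min_len` are dropped.

```python
def variable_columns(alignment):
    """Flag each alignment column that contains more than one distinct character."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [len({seq[i] for seq in alignment}) > 1 for i in range(length)]


def candidate_hvrs(alignment, max_gap=5, min_len=3):
    """Return candidate regions as 1-based, inclusive (start, end) column pairs."""
    flags = variable_columns(alignment)
    regions = []
    start = end = None
    for i, is_variable in enumerate(flags):
        if not is_variable:
            continue
        if start is None:
            start = end = i
        elif i - end - 1 <= max_gap:       # small conserved block: extend region
            end = i
        else:                              # large conserved block: close region
            regions.append((start, end))
            start = end = i
    if start is not None:
        regions.append((start, end))
    return [(s + 1, e + 1) for s, e in regions if e - s + 1 >= min_len]
```

The output would only be a starting point; the linkage of sequence patterns mentioned in the text is not captured by this column-wise rule and would still need to be checked by eye.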

Structure modeling.

The transmembrane topology of LamB, OmpA, OmpC, and OmpF was predicted with a hidden Markov model (HMM)-based method, TMHMM 2.0 (63). The tertiary structure was predicted with Phyre2 by homology modeling (57) and with RoseTTAFold ab initio (58). In addition to the accuracy or resolution estimated by the tools themselves, the results predicted by the two methods for each protein were compared with each other. A prediction was considered reliable only if the results of the two methods were consistent. For each protein, the atomic coordinates of each amino acid were recorded in PDB format, and the three-dimensional structure was illustrated with PyMOL (https://pymol.org/2/).
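The text does not state how the two predicted structures were compared. One simple, superposition-free option is the distance RMSD over C-alpha atoms, sketched below in Python. The choice of metric, the use of residue numbers to pair residues, and the function names are assumptions made for illustration; any cutoff for calling two models "consistent" would also have to be chosen by the user.

```python
import math


def ca_coordinates(pdb_text):
    """Map residue number to C-alpha (x, y, z) using fixed PDB ATOM columns.

    Chain identifiers are ignored, so the input is assumed to be a single chain.
    """
    coords = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residue_number = int(line[22:26])
            coords[residue_number] = (
                float(line[30:38]), float(line[38:46]), float(line[46:54])
            )
    return coords


def distance_rmsd(coords_a, coords_b):
    """Root mean square difference of intramolecular CA-CA distances.

    Only residues present in both models are used; the value is 0 when the
    two models differ by a rigid-body motion (or a mirror image) only.
    """
    shared = sorted(set(coords_a) & set(coords_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared residues")
    total = 0.0
    n_pairs = 0
    for i, r in enumerate(shared):
        for s in shared[i + 1:]:
            d_a = math.dist(coords_a[r], coords_a[s])
            d_b = math.dist(coords_b[r], coords_b[s])
            total += (d_a - d_b) ** 2
            n_pairs += 1
    return math.sqrt(total / n_pairs)
```

Because the measure compares internal distances, it does not require the Phyre2 and RoseTTAFold models to be superimposed first, which keeps the sketch free of linear algebra dependencies.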

ACKNOWLEDGMENTS

This study was supported by a Natural Science Fund of Shenzhen (JCYJ20190808165205582) to Y.W. Z.C. was supported by a Fund for the Cultivation of Guangdong College Students’ Scientific and Technological Innovation, Climbing Program (pdjh2021b0432). X.Ca. was supported by an Undergraduate Training Program for Innovation and Entrepreneurship of Shenzhen University (no.127) and an Undergraduate Science and Technology Innovation Program of Shenzhen University.

Y.W., Y.C., A.P.W., W.K., and G.Z. conceived the project. X.Ch., Z.C., S.L., and J.Z. collected the data. X.Ch., X.Ca., Z.C., J.W., S.L., J.Z., Y.H., and Y.W. performed the analysis. G.H. and Q.L. participated in the discussion. X.Ch., X.Ca., Z.C., G.H., Q.L., and Y.W. wrote the first draft. A.P.W., W.K., G.Z., Y.C., and Y.W. revised the manuscript. All the authors approved the final version of the manuscript.

Footnotes

Supplemental material is available online only.

Supplemental file 2
Data Set S1. aem.00060-22-s0001.xlsx (XLSX file, 16.9 KB)
Supplemental file 1
Fig. S1 to S3. aem.00060-22-s0002.pdf (PDF file, 2.7 MB)

Contributor Information

Yi Cai, Email: caiyi0113@szu.edu.cn.

Yejun Wang, Email: wangyj@szu.edu.cn.

Robert M. Kelly, North Carolina State University

REFERENCES

1. Didelot X, Maiden MCJ. 2010. Impact of recombination on bacterial evolution. Trends Microbiol 18:315–322. doi:10.1016/j.tim.2010.04.002.
2. Castillo-Ramirez S, Harris SR, Holden MTG, He M, Parkhill J, Bentley SD, Feil EJ. 2011. The impact of recombination on dN/dS within recently emerged bacterial clones. PLoS Pathog 7:e1002129. doi:10.1371/journal.ppat.1002129.
3. González-Torres P, Rodríguez-Mateos F, Antón J, Gabaldón T. 2019. Impact of homologous recombination on the evolution of prokaryotic core genomes. mBio 10:e02494-18. doi:10.1128/mBio.02494-18.
4. Cohen E, Kessler DA, Levine H. 2005. Recombination dramatically speeds up evolution of finite populations. Phys Rev Lett 94:e098102. doi:10.1103/PhysRevLett.94.098102.
5. Levin BR, Cornejo OE. 2009. The population and evolutionary dynamics of homologous gene recombination in bacterial populations. PLoS Genet 5:e1000601. doi:10.1371/journal.pgen.1000601.
6. Cavassim MIA, Andersen SU, Bataillon T, Schierup MH. 2021. Recombination facilitates adaptive evolution in rhizobial soil bacteria. Mol Biol Evol 38:5480–5490. doi:10.1093/molbev/msab247.
7. Yahara K, Didelot X, Jolley KA, Kobayashi I, Maiden MCJ, Sheppard SK, Falush D. 2016. The landscape of realized homologous recombination in pathogenic bacteria. Mol Biol Evol 33:456–471. doi:10.1093/molbev/msv237.
8. Wang Y, Chen X, Hu Y, Zhu G, White AP, Köster W. 2018. Evolution and sequence diversity of FhuA in Salmonella and Escherichia. Infect Immun 86:e00573-18. doi:10.1128/IAI.00573-18.
9. Chen SL, Hung CS, Xu J, Reigstad CS, Magrini V, Sabo A, Blasiar D, Bieri T, Meyer RR, Ozersky P, Armstrong JR, Fulton RS, Latreille JP, Spieth J, Hooton TM, Mardis ER, Hultgren SJ, Gordon JI. 2006. Identification of genes subject to positive selection in uropathogenic strains of Escherichia coli: a comparative genomics approach. Proc Natl Acad Sci USA 103:5977–5982. doi:10.1073/pnas.0600938103.
10. Petersen L, Bollback JP, Dimmic M, Hubisz M, Nielsen R. 2007. Genes under positive selection in Escherichia coli. Genome Res 17:1336–1343. doi:10.1101/gr.6254707.
11. Rollauer SE, Sooreshjani MA, Noinaj N, Buchanan SK. 2015. Outer membrane protein biogenesis in Gram-negative bacteria. Philos Trans R Soc B 370:20150023. doi:10.1098/rstb.2015.0023.
12. Kleinschmidt JH. 2015. Folding of β-barrel membrane proteins in lipid bilayers: unassisted and assisted folding and insertion. Biochim Biophys Acta 1848:1927–1943. doi:10.1016/j.bbamem.2015.05.004.
13. Millman KL, Tavaré S, Dean D. 2001. Recombination in the ompA gene but not the omcB gene of Chlamydia contributes to serovar-specific differences in tissue tropism, immune surveillance, and persistence of the organism. J Bacteriol 183:5997–6008. doi:10.1128/JB.183.20.5997-6008.2001.
14. Stenkova AM, Isaeva MP, Shubin FN, Rasskazov VA, Rakin AV. 2011. Trends of the major porin gene (ompF) evolution: insight from the genus Yersinia. PLoS One 6:e20546. doi:10.1371/journal.pone.0020546.
15. Lopez-Barbosa N, Suárez-Arnedo A, Cifuentes J, Barrios AFG, Batista CAS, Osma JF, Muñoz-Camargo C, Cruz JC. 2020. Magnetite–OmpA nanobioconjugates as cell-penetrating vehicles with endosomal escape abilities. ACS Biomater Sci Eng 6:415–424. doi:10.1021/acsbiomaterials.9b01214.
16. Confer AW, Ayalew S. 2013. The OmpA family of proteins: roles in bacterial pathogenesis and immunity. Vet Microbiol 163:207–222. doi:10.1016/j.vetmic.2012.08.019.
17. Krishnan S, Prasadarao NV. 2012. Outer membrane protein A and OprF: versatile roles in Gram-negative bacterial infections. FEBS J 279:919–931. doi:10.1111/j.1742-4658.2012.08482.x.
18. Pore D, Chakrabarti MK. 2013. Outer membrane protein A (OmpA) from Shigella flexneri 2a: a promising subunit vaccine candidate. Vaccine 31:3644–3650. doi:10.1016/j.vaccine.2013.05.100.
19. Futse JE, Buami G, Kayang BB, Koku R, Palmer GH, Graça T, Noh SM. 2019. Sequence and immunologic conservation of Anaplasma marginale OmpA within strains from Ghana as compared to the predominant OmpA variant. PLoS One 14:e0217661. doi:10.1371/journal.pone.0217661.
20. Bekhit A, Fukamachi T, Saito H, Kobayashi H. 2011. The role of OmpC and OmpF in acidic resistance in Escherichia coli. Biol Pharm Bull 34:330–334. doi:10.1248/bpb.34.330.
21. Misra R, Benson SA. 1988. Isolation and characterization of OmpC porin mutants with altered pore properties. J Bacteriol 170:528–533. doi:10.1128/jb.170.2.528-533.1988.
22. Yoshimura F, Nikaido H. 1985. Diffusion of beta-lactam antibiotics through the porin channels of Escherichia coli K-12. Antimicrob Agents Chemother 27:84–92. doi:10.1128/AAC.27.1.84.
23. Bafna JA, Sans-Serramitjana E, Acosta-Gutiérrez S, Bodrenko IV, Hörömpöli D, Berscheid A, Brötz-Oesterhelt H, Winterhalter M, Ceccarelli M. 2020. Kanamycin uptake into Escherichia coli is facilitated by OmpF and OmpC porin channels located in the outer membrane. ACS Infect Dis 6:1855–1865. doi:10.1021/acsinfecdis.0c00102.
24. Chetri S, Singha M, Bhowmik D, Nath K, Chanda DD, Chakravarty A, Bhattacharjee A. 2019. Transcriptional response of OmpC and OmpF in Escherichia coli against differential gradient of carbapenem stress. BMC Res Notes 12:138. doi:10.1186/s13104-019-4177-4.
25. Traurig M, Misra R. 1999. Identification of bacteriophage K20 binding regions of OmpF and lipopolysaccharide in Escherichia coli K-12. FEMS Microbiol Lett 181:101–108. doi:10.1111/j.1574-6968.1999.tb08831.x.
26. Parent KN, Erb ML, Cardone G, Nguyen K, Gilcrease EB, Porcek NB, Pogliano J, Baker TS, Casjens SR. 2014. OmpA and OmpC are critical host factors for bacteriophage Sf6 entry in Shigella. Mol Microbiol 92:47–60. doi:10.1111/mmi.12536.
27. Chen P, Sun H, Ren H, Liu W, Li G, Zhang C. 2020. LamB, OmpC, and the core lipopolysaccharide of Escherichia coli K-12 function as receptors of bacteriophage Bp7. J Virol 94:e00325-20. doi:10.1128/JVI.00325-20.
28. Szmelcman S, Hofnung M. 1975. Maltose transport in Escherichia coli K-12: involvement of the bacteriophage lambda receptor. J Bacteriol 124:112–118. doi:10.1128/jb.124.1.112-118.1975.
29. Desaymard C, Débarbouillé M, Jolit M, Schwartz M. 1986. Mutations affecting antigenic determinants of an outer membrane protein of Escherichia coli. EMBO J 5:1383–1388. doi:10.1002/j.1460-2075.1986.tb04371.x.
30. Heine HG, Kyngdon J, Ferenci T. 1987. Sequence determinants in the lamB gene of Escherichia coli influencing the binding and pore selectivity of maltoporin. Gene 53:287–292. doi:10.1016/0378-1119(87)90018-7.
31. Chatterjee S, Rothenberg E. 2012. Interaction of bacteriophage λ with its Escherichia coli receptor, LamB. Viruses 4:3162–3178. doi:10.3390/v4113162.
32. Porcek NB, Parent KN. 2015. Key residues of Shigella flexneri OmpA mediate infection by bacteriophage Sf6. J Mol Biol 427:1964–1976. doi:10.1016/j.jmb.2015.03.012.
33. Suga A, Kawaguchi M, Yonesaki T, Otsuka Y. 2021. Manipulating interactions between T4 phage long tail fibers and Escherichia coli receptors. Appl Environ Microbiol 87:e0042321. doi:10.1128/AEM.00423-21.
34. Marti R, Zurfluh K, Hagens S, Pianezzi J, Klumpp J, Loessner MJ. 2013. Long tail fibres of the novel broad-host-range T-even bacteriophage S16 specifically recognize Salmonella OmpC. Mol Microbiol 87:818–834. doi:10.1111/mmi.12134.
35. Cai R, Wu M, Zhang H, Zhang Y, Cheng M, Guo Z, Ji Y, Xi H, Wang X, Xue Y, Sun C, Feng X, Lei L, Tong Y, Liu X, Han W, Gu J. 2018. A smooth-type, phage-resistant Klebsiella pneumoniae mutant strain reveals that OmpC is indispensable for infection by phage GH-K3. Appl Environ Microbiol 84:e01585-18. doi:10.1128/AEM.01585-18.
36. Vakharia H, Misra R. 1996. A genetic approach for analyzing surface-exposed regions of the OmpC protein of Escherichia coli K-12. Mol Microbiol 19:881–889. doi:10.1046/j.1365-2958.1996.430957.x.
37. Hu Y, Huang H, Hui X, Cheng X, White AP, Zhao Z, Wang Y. 2016. Distribution and evolution of Yersinia leucine-rich repeat proteins. Infect Immun 84:2243–2254. doi:10.1128/IAI.00324-16.
38. Chevalier S, Bodilis J, Jaouen T, Barray S, Feuilloley MG, Orange N. 2007. Sequence diversity of the OprD protein of environmental Pseudomonas strains. Environ Microbiol 9:824–835. doi:10.1111/j.1462-2920.2006.01191.x.
39. Derrick JP, Urwin R, Suker J, Feavers IM, Maiden MC. 1999. Structural and evolutionary inference from molecular variation in Neisseria porins. Infect Immun 67:2406–2413. doi:10.1128/IAI.67.5.2406-2413.1999.
40. Posada D, Crandall KA, Nguyen M, Demma JC, Viscidi RP. 2000. Population genetics of the porB gene of Neisseria gonorrhoeae: different dynamics in different homology groups. Mol Biol Evol 17:423–436. doi:10.1093/oxfordjournals.molbev.a026322.
41. Urwin R, Holmes EC, Fox AJ, Derrick JP, Maiden MC. 2002. Phylogenetic evidence for frequent positive selection and recombination in the meningococcal surface antigen PorB. Mol Biol Evol 19:1686–1694. doi:10.1093/oxfordjournals.molbev.a003991.
42. Haake DA, Suchard MA, Kelley MM, Dundoo M, Alt DP, Zuerner RL. 2004. Molecular evolution and mosaicism of leptospiral outer membrane proteins involves horizontal DNA transfer. J Bacteriol 186:2818–2828. doi:10.1128/JB.186.9.2818-2828.2004.
43. Baldo L, Desjardins CA, Russell JA, Stahlhut JK, Werren JH. 2010. Accelerated microevolution in an outer membrane protein (OMP) of the intracellular bacterium Wolbachia. BMC Evol Biol 10:48. doi:10.1186/1471-2148-10-48.
44. Falush D, Kraft C, Taylor NS, Correa P, Fox JG, Achtman M, Suerbaum S. 2001. Recombination and mutation during long-term gastric colonization by Helicobacter pylori: estimates of clock rates, recombination size, and minimal age. Proc Natl Acad Sci USA 98:15056–15061. doi:10.1073/pnas.251396098.
45. Nell S, Kennemann L, Schwarz S, Josenhans C, Suerbaum S. 2014. Dynamics of Lewis b binding and sequence variation of the babA adhesin gene during chronic Helicobacter pylori infection in humans. mBio 5:e02281-14. doi:10.1128/mBio.02281-14.
46. Kumar S, Caimano MJ, Anand A, Dey A, Hawley KL, LeDoyt ME, La Vake CJ, Cruz AR, Ramirez LG, Paštěková L, Bezsonova I, Šmajs D, Salazar JC, Radolf JD. 2018. Sequence variation of rare outer membrane protein beta-barrel domains in clinical strains provides insights into the evolution of Treponema pallidum subsp. pallidum, the syphilis spirochete. mBio 9:e01006-18. doi:10.1128/mBio.01006-18.
47. Hantke K, Braun V. 1975. Membrane receptor dependent iron transport in Escherichia coli. FEBS Lett 49:301–305. doi:10.1016/0014-5793(75)80771-x.
48. Braun V. 2009. FhuA (TonA), the career of a protein. J Bacteriol 191:3431–3436. doi:10.1128/JB.00106-09.
49. Guo H, Hu BJ, Yang XL, Zeng LP, Li B, Ouyang S, Shi ZL. 2020. Evolutionary arms race between virus and host drives genetic diversity in bat severe acute respiratory syndrome-related coronavirus spike genes. J Virol 94:e00902-20. doi:10.1128/JVI.00902-20.
50. Mummidi S, Ahuja SS, Gonzalez E, Anderson SA, Santiago EN, Stephan KT, Craig FE, O’Connell P, Tryon V, Clark RA, Dolan MJ, Ahuja SK. 1998. Genealogy of the CCR5 locus and chemokine system gene variants associated with altered rates of HIV-1 disease progression. Nat Med 4:786–793. doi:10.1038/nm0798-786.
51. Guttman DS, Dykhuizen DE. 1994. Clonal divergence in Escherichia coli as a result of recombination, not mutation. Science 266:1380–1383. doi:10.1126/science.7973728.
52. Vos M, Didelot X. 2009. A comparison of homologous recombination rates in bacteria and archaea. ISME J 3:199–208. doi:10.1038/ismej.2008.93.
53. McVean GA, Charlesworth B. 2000. The effects of Hill-Robertson interference between weakly selected mutations on patterns of molecular evolution and variation. Genetics 155:929–944. doi:10.1093/genetics/155.2.929.
54. Cooper TF. 2007. Recombination speeds adaptation by reducing competition between beneficial mutations in populations of Escherichia coli. PLoS Biol 5:e225. doi:10.1371/journal.pbio.0050225.
55. Baltrus DA, Guillemin K, Phillips PC. 2008. Natural transformation increases the rate of adaptation in the human pathogen Helicobacter pylori. Evolution 62:39–49. doi:10.1111/j.1558-5646.2007.00271.x.
56. Vos M. 2009. Why do bacteria engage in homologous recombination? Trends Microbiol 17:226–232. doi:10.1016/j.tim.2009.03.001.
57. Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJE. 2015. The Phyre2 web portal for protein modeling, prediction, and analysis. Nat Protoc 10:845–858. doi:10.1038/nprot.2015.053.
58. Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, Wang J, Cong Q, Kinch LN, Schaeffer RD, Millán C, Park H, Adams C, Glassman CR, DeGiovanni A, Pereira JH, Rodrigues AV, van Dijk AA, Ebrecht AC, Opperman DJ, Sagmeister T, Buhlheller C, Pavkov-Keller T, Rathinaswamy MK, Dalwadi U, Yip CK, Burke JE, Garcia KC, Grishin NV, Adams PD, Read RJ, Baker D. 2021. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373:871–876. doi:10.1126/science.abj8754.
59. Touchon M, Hoede C, Tenaillon O, Barbe V, Baeriswyl S, Bidet P, Bingen E, Bonacorsi S, Bouchier C, Bouvet O, Calteau A, Chiapello H, Clermont O, Cruveiller S, Danchin A, Diard M, Dossat C, Karoui ME, Frapy E, Garry L, Ghigo JM, Gilles AM, Johnson J, Le Bouguénec C, Lescat M, Mangenot S, Martinez-Jéhanne V, Matic I, Nassif X, Oztas S, Petit MA, Pichon C, Rouy Z, Ruf CS, Schneider D, Tourret J, Vacherie B, Vallenet D, Médigue C, Rocha EP, Denamur E. 2009. Organized genome dynamics in the Escherichia coli species results in highly diverse adaptive paths. PLoS Genet 5:e1000344. doi:10.1371/journal.pgen.1000344.
60. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30:2725–2729. doi:10.1093/molbev/mst197.
61. Martin DP, Murrell B, Golden M, Khoosal A, Muhire B. 2015. RDP4: detection and analysis of recombination patterns in virus genomes. Virus Evol 1:1–5.
62. Yang Z. 2007. PAML 4: a program package for phylogenetic analysis by maximum likelihood. Mol Biol Evol 24:1586–1591. doi:10.1093/molbev/msm088.
63. Krogh A, Larsson B, von Heijne G, Sonnhammer ELL. 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305:567–580. doi:10.1006/jmbi.2000.4315.


Articles from Applied and Environmental Microbiology are provided here courtesy of American Society for Microbiology (ASM)
