Skip to main content
mSystems logoLink to mSystems
. 2023 May 24;8(3):e00008-23. doi: 10.1128/msystems.00008-23

Flotillin-associated rhodopsin (FArhodopsin), a widespread paralog of proteorhodopsin in aquatic bacteria with streamlined genomes

Jose M Haro-Moreno 1,2, Mario López-Pérez 1, Alexey Alekseev 3, Elizaveta Podoliak 4, Kirill Kovalev 5, Valentin Gordeliy 2,6, Ramunas Stepanauskas 7, Francisco Rodriguez-Valera 1,
Editor: Steven J Hallam8
PMCID: PMC10308929  PMID: 37222519

ABSTRACT

Microbial rhodopsins are found more than once in a single genome (paralogs) often have different functions. We screened a large dataset of open ocean single-amplified genomes (SAGs) for co-occurrences of multiple rhodopsin genes. Many such cases were found among Pelagibacterales (SAR11), HIMB59, and the Gammaproteobacteria Pseudothioglobus SAGs. These genomes always had a bona fide proteorhodopsin and a separate cluster of genes containing a second rhodopsin associated with a predicted flotillin coding gene and have thus been named flotillin-associated rhodopsins (FArhodopsins). Although they are members of the proteorhodopsin protein family, they form a separate clade within that family and are quite divergent from known proton-pumping proteorhodopsins. They contain either DTT, DTL, or DNI motifs in their key functional amino acids. FArhodopsins are mainly associated with the lower layers of the epipelagic zone. All marine FArhodopsins had the retinal binding lysine, but we found relatives in freshwater metagenomes lacking this key amino acid. AlphaFold predictions of marine FArhodopsins indicate that their retinal pocket might be very reduced or absent, hinting that they are retinal-less. Freshwater FArhodopsins were more diverse than marine ones, but we could not determine if there were other rhodopsins in the genome due to the lack of SAGs or isolates. Although the function of FArhodopsins could not be established, their conserved genomic context indicated involvement in the formation of membrane microdomains. The conservation of FArhodopsins in diverse and globally abundant microorganisms suggests that they may be important in the adaptation to the twilight zone of aquatic environments.

IMPORTANCE

Rhodopsins have been shown to play a key role in the ecology of aquatic microbes. Here, we describe a group of widespread rhodopsins in aquatic microbes associated with dim light conditions. Their characteristic genomic context found in both marine and freshwater environments indicates a novel potential involvement in membrane microstructure that could be important for the function of the coexisting proteorhodopsin proton pumps. The absence or reduction of the retinal binding pocket points to a drastically different physiological role.

KEYWORDS: metagenomics, Pelagibacterales, rhodopsin, rhodopsin paralogs, single-amplified genomes, streamlined genomes

INTRODUCTION

Rhodopsins are probably the most universal biological light-energy transducers. They are found in all the domains of life (eukaryotes, bacteria, and archaea) and also in viruses (1). Although they all share similar 3D structures and active sites that involve a retinal cofactor, they are extremely diverse in sequence (2, 3) and are classified into three homology groups. Type-1 are microbial rhodopsins found in prokaryotes and protists and have been shown to play a key role in the ecology of aquatic microbes by carrying out different light-related functions (4, 5), type-2 are animal-eye rhodopsins, and a third kind, type-3 (heliorhodopsins), have been recently described (6, 7), and they are also widespread in microbes and their function remains largely unknown although some clues have been recently reported (8 - 10). All have in common the retinal chromophore, although retinal-less rhodopsins are known (11, 12).

It is now well established that microbial rhodopsins (like microbes themselves) are barely represented by culture collections. Most of their diversity is found in the vast reservoir of uncultivated microbes present in the ocean and other aquatic environments exposed to light (5, 13 - 16). Microbial type-1 rhodopsins were initially discovered in the halophilic archaeon Halobacterium salinarum (17), which contains four paralogous type-1 rhodopsin genes coding for different functions: H+ or Cl- pumping, or light sensing with two different light ranges (18). In haloarchaea, the most common case is that more than one rhodopsin gene is found per genome reaching up to six described for Haloarcula marismortui (19). In other aquatic bacteria, the presence of multiple rhodopsin genes (paralogs) in the genome is much rarer, and only a few marine bacteria, such as Dokdonia eikasta, Gillisia limnaea, or Nonlabens marinus, encode for, in addition to the regular proton-pumping rhodopsin, a sodium-pump and a chloride-pump (20 - 22). One potential reason for such scarcity is that many pelagic aquatic microbes have streamlined genomes that are depleted in paralog genes. Besides, they tend to be hard to grow and are largely known by metagenomic assemblies that are often incomplete. However, the presence of paralog copies of rhodopsins is interesting, particularly when the sequence is widely divergent since they may provide different functions to the microbe. Hence paralog rhodopsins are a potential source of novel functions of these versatile proteins that have also biotechnological potential for their use in optogenetics (23, 24).

Here, we evaluate the genetic repertoire of rhodopsins in a large collection of single-amplified genomes (SAGs), which were generated from a global set of seawater samples from the tropical and subtropical, epipelagic ocean (25). The screening for rhodopsins resulted in approximately 3% of SAGs containing two type-1 rhodopsin paralogs within their genomes. While one of them was predicted to be a standard proton pump, the other paralog belonged to a formerly described cluster of rhodopsins discovered in contigs assembled from deep-water metagenomes in station ALOHA (26). The presence of this second rhodopsin was found restricted to three taxa known to have small and streamlined genomes—Pelagibacterales, HIMB59, and the Gammaproteobacteria Pseudothioglobus. This rhodopsin gene was always found next to a flotillin gene. Metagenomic screening in marine and freshwater samples revealed a broad distribution of these sequences in both types of environments, always associated with deeper low-light intensity habitats. Our results on the protein structure, metagenomic recruitment, and genomic context can help to understand its function in these streamlined microbes.

MATERIALS AND METHODS

Screening of rhodopsin proteins from single-amplified genomes and metagenomic datasets

To evaluate the presence and the number of rhodopsin sequences in the euphotic marine microbiome, a compendium of 12,715 SAGs was downloaded from NCBI under BioProject PRJEB33281 (25) and subjected to study. Prior to the analysis, the degree of completeness and contamination of SAGs were estimated using CheckM v1.1.2 (27), and SAGs with > 50% completeness and < 5% contamination were kept. Taxonomic classification of SAGs was performed using the GTDB-Tk v2.1.0 tool (28) using the Genome Taxonomy Database (GTDB) release R207 (29).

Metagenomes from different marine and freshwater datasets were used to retrieve similar protein sequences to FArhodopsins, as well as the contigs containing them. Marine datasets included metagenomic assemblies and PacBio CCS reads from a local time series in the Mediterranean Sea (30, 31), Hawaii Ocean Time-series (HOTs), and Bermuda Atlantic Time-series Study (BATS) (32), as well as from large oceanic expeditions, such as Tara Oceans (33) and GEOTRACES (32). Freshwater datasets represented metagenomic assemblies from Lake Baikal (34, 35), Spanish reservoirs (36), and a compendium of 17 different freshwater lakes located in Europe and Asia (37). Assembled contigs from marine and freshwater metagenomes longer or equal to 5 Kb were assigned family- or genus-level classification if at least 50% of the genes shared the same best-hit taxonomy. Contigs failing this threshold were classified at the level of phylum.

Recovery of rhodopsin proteins was achieved using a custom and curated database of rhodopsin sequences, comprising all the families described so far from both type-1 and type-3 rhodopsins. HMMs of these families were built using hmmbuild (38) after a previous alignment by muscle (39). Putative protein sequences were then screened using hmmscan (38). Only sequence hits >= 200 aa, with an evalue < 1e-10 and a bitscore >= 75 were kept.

Phylogenetic classification of microbial rhodopsins

A maximum likelihood phylogenetic tree was built with all the type-1 microbial rhodopsins recovered from the pool of marine SAGs, as well as from the metagenomic sequences retrieved from marine and freshwater datasets. As references, we included well-characterized rhodopsin proteins with several functions described so far (H+, Na+, Cl-, and sensory rhodopsins). Sequences were first aligned with muscle (39) and the resulting alignment was trimmed with trimal (40) to remove regions with gaps in more than 80% of the sequences. The phylogenetic tree was built with iqtree (41) with 5,000 ultrafast bootstraps and the -m MFP option to find the best model that fitted our data.

Determination of gene clusters carrying FArhodopsins

In order to determine the location of the FArhodopsin and the absence of it in other genomes from the same taxonomic group, Pelagibacterales SAGs containing the second rhodopsin were compared using blastp (42) against the isolate genomes HIMB083 (genomospecies Ia.3/V), HTCC7011 (genomospecies Ia.3/I), and HTCC1062 (genomospecies Ia.1/I). Genomic regions were then extracted and genes were predicted for further analyses. In a similar approach, the genomic islands containing the rhodopsin in HIMB59 and the gammaproteobacterial Pseudothioglobus were determined and extracted by comparing the genomes against closely related (>90% average nucleotide identity [ANI]) SAGs.

Taxonomic and functional annotation of rhodopsin-carrying contigs

Prodigal v2.6.3 (43) was used to predict genes from contigs retrieved from the individual SAGs, as well as from the metagenomic assemblies. Predicted protein-encoded genes were taxonomically and functionally annotated against the NCBI NR database using DIAMOND 0.9.15 (44) and against COG (45) and TIGRFAM (46) using HMMscan v3.3 (38). Domains within proteins were predicted using InterPro (47). In order to determine the location of predicted proteins within the gene cluster, transmembrane domains and signal peptides were identified with DeepTMHMM (48) and SignalP v6.0 (49), respectively.

Retrieval of SPFH proteins and phylogenetic tree of flotillin sequences

Domain sequences of the SPFH family of proteins were downloaded from the Pfam database (50) and then used to search for candidates within SAGs with hmmscan. The resulting flotillin sequences were then phylogenetically classified. Reference sequences from eukaryotic flotillins Flot-1 and Flot-2 (Bos Taurus, Homo sapiens, and Rattus norvergicus) together with well-characterized sequences from Bacillus, Staphylococcus, and E. coli were downloaded from the Uniprot database. We also included the closest hits determined by a blastp search against the NCBI protein database. Sequences were first aligned with muscle, and the resulting alignment trimmed with trimal to remove regions with gaps in more than 80% of the sequences. A maximum-likelihood phylogenetic tree was built with iqtree with 5,000 ultrafast bootstraps and the -m MFP option to find the best model that fitted to our data.

Metagenomic and metatranscriptomic read recruitment

Illumina raw reads from Tara Oceans metagenomes and metatranscriptomes were downloaded from the European Nucleotide Archive (ENA) following the accession numbers PRJEB1787 and PRJEB6608, respectively. Eight additional samples covering a metagenomic depth profile from the Mediterranean Sea during thermal stratification (summer—15, 30, 45, 60, 75, 90, 1,000, and 2,000 m) were downloaded from NCBI BioProjects PRJNA352798 and PRJNA257723. Raw reads were trimmed with Trimmomatic v0.39 (51). Coverage values were calculated by read alignment (in subsets of 20 million reads) against the genomic islands containing the FArhodopsin using BLASTN v2.9.0 (90% identity, > 50 bp alignment). At least 70% of the length was required to be covered by reads to be considered a recruit. Reads were normalized by the size of the contig in Kb and by the size of the metagenome in Gb (RPKGs). Transcripts from metatranscriptomic samples were aligned using BLASTN (98% identity, > 50 bp alignment) and normalized as transcripts per kilobase of gene per million reads (RPKM).

In silico determination of protein structures

We used AlphaFold (52), a protein structure prediction server hosted on Google Colab (53), to predict the structures of three FArhodopsins (AG-390-L03 of Pelagibacterales, AG-337-N20 of HIMB59, and AG-313-D08 of Pseudothioglobus) using standard parameters. We then used PyMOL (https://pymol.org/2/) to align these structures to the structure of a blue-light absorbing proteorhodopsin from Med12 (PDB ID 4JQ6). Since the FArhodopsins and proteorhodopsin shared high overall similarity and the positions of the lysine residues, which are putative binding residues for retinal coincided, we transferred the all-trans retinal molecule from proteorhodopsin (PDB ID 4JQ6) to the FArhodopsins for further analysis of the retinal binding pocket.

RESULTS AND DISCUSSION

Paralog rhodopsin recovery and diversity

To identify rhodopsin paralogs, we first tried to locate them in cultivated marine microbes genomes (RefSeq database, April 2022), but none were found, with the known exceptions mentioned above. This was not surprising considering the narrow range of diversity covered by cultivated marine bacteria, particularly of streamlined and abundant microbes. Fortunately, an alternative to retrieve monoclonal genomes of marine bacteria is SAGs. Although many of them are incomplete, they offer a significant guarantee that they start from a single cell and that the redundant genes found in them are real paralogs. Thus, we used a large collection of marine SAGs, composed of 12,715 partial genomes, which were produced using a random cell selection strategy from the epipelagic seawater samples from tropical and subtropical latitudes of the Atlantic and Pacific Oceans (25). The taxonomic composition of marine prokaryoplankton in this database is consistent with previous 16S rRNA amplicon and shotgun metagenomic studies of marine off-shore waters, indicating that this dataset is representative of the in situ microbial communities (25). We removed SAG assemblies with an estimated completeness lower than 50%, which is the threshold for classifying them as medium quality (54). Finally, a total of 4,751 SAGs were screened for the presence of rhodopsin genes in their genomes by training HMMs against a large collection of type-1 and type-3 known rhodopsin proteins (see Materials and Methods). Among them, 63% coded for type-1 rhodopsins (Fig. 1A). As previously described from metagenomic assemblies, most of these sequences were found in microbes from the order Pelagibacterales (#1,619), SAR86 (#386), Flavobacteriales (#258), HIMB59 clade (#191), and Ca. Actinomarinales (#139), as well as other less abundant taxa (Table S1).

Fig 1.

Fig 1

(A) Pie charts indicating the percentage of single-amplified genomes (SAGs) in which two, one, or no type-1 rhodopsins were detected. Only those SAGs (from a large database collected from the tropical and subtropical euphotic ocean (25), that met the established quality criteria of ≥50% completeness and ≤5% contamination, i.e., medium to high quality draft genomes were included in the analysis. For those SAGs containing two rhodopsin genes, a second pie chart indicates their taxonomic affiliation. (B) Maximum likelihood phylogenetic tree of type-1 rhodopsin genes detected in the marine SAGs, together with several rhodopsin references covering proteorhodopsins, xenorhodopsins, xanthorhodopsins, marine euryarchaeal rhodopsins, flavobacterial rhodopsins, marine and freshwater actinobacterial rhodopsins, and haloarchaeal halorhodopsins and bacteriorhodopsins. Branches have been colored for SAGs harboring two rhodopsins according to their taxonomy (blue, Pelagibacterales; purple, HIMB59; and green, Pseudothioglobus). Sequences belonging to the new clade of rhodopsins (Flotillin-associated rhodopsin; FArhodopsin) are highlighted in red. Branches within color background contain type-1 rhodopsins from the SAG dataset. The yellow circles correspond to the SAGs containing the two type-1 rhodopsin paralogs. (C) Protein sequence alignment of Helix C and G residues for the two type-1 rhodopsins. Two reference proteorhodopsin proton pumps at the top. Residues involved in rhodopsin function (e.g., ion translocation) and wavelength absorption are highlighted in blue and orange, respectively. The lysine (K) highlighted in red (helix G) is responsible for the covalent binding of retinal. Dark and light grey regions represent conserved residues within the alignment.

As previously reported, Ca. Actinomarinales SAGs coded always for type-1 and type-3 rhodopsins (55), but a small set of SAGs (~3%) coded for two type-1 rhodopsins (paralogs) within their genomes. A close inspection revealed that they all belonged to three taxonomic groups: Pelagibacterales (SAR11), HIMB59 (formerly AEGEAN-169), and the gammaproteobacterial Pseudothioglobus (Fig. 1A), all typical marine streamlined microbes. Remarkably, none of the genomes coding for a type-3 rhodopsin coded for two type-1 paralogs (Table S1). To analyze the phylogenetic diversity of the recovered paralogs of type-1 microbial rhodopsins, a maximum likelihood phylogenetic tree was constructed, including several reference sequences from culture genomes and MAGs covering well-known type-1 rhodopsins (Fig. 1). The phylogenetic tree showed that although one of the rhodopsin sequences clustered according to their taxonomic classification and into the proton-pump proteorhodopsin family (Fig. 1B), the sequences of the second rhodopsin paralog, which we named FArhodopsin, clustered on an independent branch (Fig. 1B).

An analysis of the protein alignment of the two paralog rhodopsins (Fig. 1C) indicated that all seemed to be potentially functional. Both paralogs code for the lysine (K) in helix G responsible for the binding of the retinal to the protein via a Schiff base reaction although structural and experimental predictions make it unlikely that one of them binds retinal (see below). In addition to many major differences, including a deletion of 23 aa at the N terminus of the second rhodopsin (FArhodopsin), which might include a leader peptide (Fig. S1), we observed differences in the ion-pumping motif. While one of the paralogs showed the DTE motif (positions 85, 89, and 96 according to the amino acid sequence of bacteriorhodopsin from Halobacterium salinarum) identical to the green and blue light proton-pumping proteorhodopsins (Fig. 1C), in the FArhodopsin we detected three variations. In the case of Pelagibacterales and the Pseudothioglobus, the FArhodopsin differed in the last amino acid of the triad, being threonine (DTT) as has been described previously for metagenomic assembled rhodopsins from ALOHA (26) and leucine (DTL), respectively, while in HIMB59, the DTE motif was replaced by aspartate-asparagine-isoleucine (DNI). In addition, the amino acid residue suggested to be critical to the spectral tuning of rhodopsins (position 93 in bacteriorhodopsin) also changed. Leucine, methionine, and isoleucine residues are detected commonly in proteorhodopsins absorbing in the green light spectrum, whereas a rhodopsin containing glutamine absorbing in the blue light range. The FArhodopsin sequences of HIMB59 and Pseudothioglobus had either the glutamine or leucine, while the Pelagibacterales FArhodopsin coded for threonine at that residue.

The genomic context of FArhodopsins

The genomic comparison of Pelagibacterales SAGs containing the second rhodopsin against the genomes of isolates HIMB083 (genomospecies Ia.3/V), HTCC7011 (genomospecies Ia.3/I), and HTCC1062 (genomospecies Ia.1/I) (56) indicated that FArhodopsins were always located on a gene cluster of approx. 26 Kb that was absent from the isolate genomes (Fig. 2A). Although SAGs containing this cluster were phylogenetically distant from each other, as they belonged to different subclades Fig. 2B, most of the cluster genes were conserved and syntenic (Fig. 2B). Despite some genomic rearrangements, the rhodopsin in the cluster always appeared co-localized on the same strand next (16 nucleotides intergenic spacer) to a large gene (1,977 bp) predicted to code for a flotillin. Flotillins are members of a widely conserved superfamily of proteins termed Stomatin, Prohibitin, Flotillin, and HfK/C (SPFH) domain proteins (57). Eukaryotic flotillins have been extensively studied (58) and only later they were found in prokaryotes (59, 60). They are classified as scaffold proteins associated with lipid rafts, lipid microdomains in membranes, which enhance the assembly of raft-associated proteins (61, 62). The gene clusters containing the rhodopsin in HIMB59 and Pseudothioglobus were delimited by comparing the genomes against closely related (>90% average nucleotide identity [ANI]) SAGs (the pure culture HIMB59 available was too distant). Regardless of the gain or loss of some genes, synteny was maintained among HIMB59 relatives, despite a wide phylogenetic distance. Even Pseudothioglobus (a gammaproteobacterium) genomes conserved the flotillin and the three last genes of the cluster located on the same strand and potentially co-transcribed (Fig. 2B and Table S2). We evaluated the presence of flotillin coding genes as well as any other “SPFH” domain-containing proteins in the initial set of marine SAGs (see the Methods section). Among them, nearly 85% were positive for at least one “SPFH” domain-containing hit. A manual inspection resulted in a broad classification of these proteins as HflC and HflK. These proteins form the complex HflKC, involved in the regulation of the protease FtsH not related to the eukaryotic flotillins (63). Otherwise, the flotillin gene was detected mainly in genomes coding for the FArhodopsin (only 12 bona fide flotillin proteins were detected not belonging to the rhodopsin gene cluster) (Fig. S2). Phylogenetic classification of detected flotillins against well-characterized eukaryotic and prokaryotic sequences resulted in flotillin homologs from Pelagibacterales, HIMB59, and Pseudothioglobus forming a separate cluster (Fig. S2).

Fig 2.

Fig 2

(A) Genomic location of the FArhodopsin gene cluster and its neighboring genes (colored in light blue the four closest genes at the boundaries) of Pelagibacterales SAG AG-390-L03 in relation to the reference genomes HTCC1002, HTCC7211, and HIMB083. Genes colored in yellow in the upper alignment represent the (flotillin-associated (FAR) gene cluster. The FArhodopsin and the flotillin are colored in red and blue, respectively. (B) Genomic alignment (amino acids) of the flotillin-associated rhodopsin gene cluster found in Pelagibacterales, HIMB59 and Pseudothioglobus genomes. The ANI value for the whole genome is indicated. The region containing the FArhodopsin and flotillin genes is highlighted in a red box. Genes that are homologous are colored with the same color. Annotation is provided for the AG-390-L03 genome.

The analysis of the genes found in the FArhodopsin cluster (FAR cluster) of the Pelagibacterales SAGs (Fig. 2B) shows very conserved synteny despite the divergence (ANI) of the overall genomes. The gene cluster is always bordered by long spacers (236 and 219 nucleotides in the AG-390-L03) that are also well conserved. After the spacer, there is a set of 10 genes on the same strand and likely also co-transcribed (see below). The next gene is annotated as a sterol-binding-domain of a sterol-carrier protein, likely involved in lipid transport. There is also an inner membrane translocator and a membrane sodium symporter. Thus, a correlation between membrane structure and function is strongly suggested. Only Pelagibacterales and HIMB59 encoded for a cluster of three genes annotated as biotin-dependent carboxyltransferases and carboxylases that might be involved in the degradation of urea. Other metabolic genes detected included an N,N-dimethylformamidase, involved in the glyoxylate and dicarboxylate metabolism (64); the PxpA-like subunit, involved in the conversion of 5-oxoproline to L-glutamate (65); as well as some uncharacterized alpha/beta hydrolases, iron-dependent alcohol dehydrogenases, and aldolases. In addition, the Pseudothioglobus FAR cluster encoded for the transcriptional regulator LuxR and a hybrid sensor histidine kinase/response regulator (genes annotated in the FAR gene cluster are listed in Table S2).

FArhodopsins in aquatic metagenomes

The presence of homologs to FArhodopsin in the ALOHA station depth profile has been extensively studied (see above) (26). The authors concluded that the DTT-T rhodopsin was present mainly in samples from 200 to 1,000 m depth. Thus, to recover more genetic diversity of FArhodopsins we analyzed the assemblies of marine metagenomes, including western Mediterranean Sea depth profiles (30, 31), BATS (32), and global ocean surveys such as Tara Oceans (33) and GEOTRACES (32). A western Mediterranean sample has also been analyzed recently with long-read HiFi sequencing (66) that does not need assembly to recover complete gene clusters. Considering that microbes of the order Pelagibacterales, which are the main contributors of FArhodopsins, can be also found in freshwater ecosystems (e.g., Fonsibacter), we included in the analysis samples taken from lakes in North America (67), Europe, and Asia (36, 37, 68), including Lake Baikal (34, 35), where representatives of the Pelagibacter Ia.5 (34, 56) clade were recently found.

Table S3 summarizes the number of rhodopsins type-1, type-3 (heliorhodopsins), FArhodopsins, and flotillins recovered in each metagenomic sample. In agreement with what we found in the dataset of marine SAGs, microbial type-1 rhodopsins represented the majority of rhodopsin sequences in marine and freshwater assembled contigs, followed by heliorhodopsins. We could recover hundreds of FArhodopsin sequences, including from freshwater datasets (Table S3). This analysis showed the uneven recovery of FArhodopsin and flotillin sequences, as indicated by the frequency on which these two genes appear together in a single contig, which could call into question the strong association described previously between FArhodopsin and flotillin (83.6% ± 7.2%, Table S3). However, a comprehensive study of the assembled contigs containing the rhodopsin revealed that most of them had FArhodopsins at the contig ends, indicating a breakpoint in the assembly, a common occurrence in regions of high microdiversity. Only in the metagenomic sample from the Mediterranean Sea sequenced with PacBio CCS (66), FArhodopsin and flotillin numbers were almost at par (407 and 455 sequences, respectively, 93.1% of the FArhodopsin were collocated with a flotillin, Table S3). All FArhodopsins well covered by the long reads contained the tandem confirming a consistent association. This result illustrates one of the major advantages of long-read metagenomics, allowing to recover complete genes and even operons from the bulk of CCS reads, without relying on metagenomic assembly.

The phylogeny of all of the recovered FArhodopsin proteins (#912) resulted in two major phylogenetic branches, separating the marine from the majority of the freshwater sequences (Fig. 3A). All marine metagenomic FArhodopsin-containing assemblies were classified within Pelagibacterales, HIMB59, and Pseudothioglobus clades, regardless of depth or geographical origin (e.g., latitude), indicating that apparently no other microbes encode FArhodopsin globally. Interestingly, one cluster composed of sequences from deep samples of Lake Baikal (390 m, Russia) and Lake Constance (200 m, Switzerland) grouped within the Pelagibacterales FArhodopsin. A close inspection revealed a well-conserved synteny between these contigs and the marine SAG AG-390-L03 (Fig. 3B). Based on gene annotation of the surrounding regions, these sequences belonged to the freshwater Ia.5 Pelagibacter clade, which was reported for the first time in freshwaters in Lake Baikal (34). Other freshwater metagenomes, on the other hand, showed a less restricted taxonomic distribution of FArhodopsins, with several distinct subclades indicating a higher sequence diversity (Fig. 3A). Interestingly, protein alignment of these novel freshwater metagenomic sequences revealed that most did not encode for the lysine (K) in helix G responsible for the covalent binding of retinal to the protein, but they showed the amino acid variants glutamine, threonine, leucine, or arginine at this site (Fig. 3A and Table S4). Figure 3B also shows that despite the divergence, sequences coming from freshwater preserve some genes of the FAR cluster found in the marine FARs. Besides, although the contigs harboring this protein were short hampering their taxonomic classification (as is typical of complex communities), those that were long enough always encoded for a flotillin next to the rhodopsin gene, highlighting the key role the flotillin gene plays in the cluster. Lysine-less rhodopsins (Rh-noK) have been already described for a group of xanthorhodopsins-related sequences in marine betaproteobacteria (12) and were shown to be retinal binding and functional proton pumps when the lysine residue was reconstituted by genetic manipulation. Interestingly, Rh-noK was found located in tandem with a functional proteorhodopsin gene, and the authors speculated that both gene products could form a multimer in which Rh-noK could regulate the proton pumping activity of the neighbor (12).

Fig 3.

Fig 3

(A) Maximum likelihood phylogenetic tree of FArhodopsin sequences detected from marine and freshwater datasets were colored in red and blue, respectively. The full description of samples can be found in Table S4 and within the Materials and Methods section. sequences with 100% amino acid identity with those in the tree are not shown. Sequences in the blue background branching to the right (all from freshwater metagenomes) do not have lysine in the retinal binding position in helix G (Table S4). Numbers within freshwater branches represent clades on which a contig was used as reference for the genomic comparison shown in B. (B) Genomic comparison of freshwater FArhodopsin subclades, using the Pelagibacterales SAG AG-390-L03 (subclade Ia.3) as a marine reference. Numbers between contigs represent the average amino acid identity (AAI, %) of shared proteins between them.

Recruitment of metagenomic and metatranscriptomic fragments on the FAR gene cluster

To assess the abundance and distribution of this novel group of rhodopsins, we performed metagenomic fragment recruitment analysis by recruiting all the FAR gene clusters in the SAGs of Pelagibacterales (#24), HIMB59 (#69), and Pseudothioglobus (#10) taxa to 140 metagenomes from a depth profile in the Mediterranean Sea (31) and the Tara Oceans datasets (33). As expected, the three taxa showed differential depth recruitment in the Mediterranean Sea (Fig. 4A), and the same was found for the Tara Oceans datasets containing deeper samples (Fig. S3). Pelagibacterales are the most abundant microbes in surface layers of the water column, although it is known that this diverse order has an eurybathic distribution, with some clades being detected in the deep chlorophyll maximum (DCM), lower photic (LP) metagenomes, and in meso- and bathypelagic samples (31). However, the Pelagibacterales FAR cluster recruited more in deeper samples such as the DCM, lower photic, and bathypelagic samples, represented by 45, 60, 75, 90, 1,000, and 2,000 m depths in the available Mediterranean metagenomes (57) (Fig. 4A). A similar preference for deeper waters was found globally with the Tara Ocean samples (Fig. S3). Thus, confirming the trend discovered for the DTT-T rhodopsins found by Olson et al at ALOHA (26). In the Mediterranean metagenomes, the maximum recruitment of the cluster was found at the LP (75, 90 m), much shallower than the 500 m detected at the more oligotrophic, permanently stratified waters at ALOHA (26). A similar trend was found for the Gammaproteobacteria Pseudothioglobus. Contrastingly, HIMB59 FAR (with the DNI motif in the FArhodopsin) recruited the most at the upper photic (15, 30 m deep), following closely the recruitment of the full SAGs (data not shown) that were detected almost exclusively in the upper photic. In addition, we could not find a clear association of freshwater FArhodopsins with deeper waters, in Lake Baikal they appeared near the surface and down to 390 m deep but not in a bathypelagic sample (data not shown). In freshwaters, the photic zone is harder to delimit given the diversity of trophic status that characterize lakes and their overall shallower depths.

Fig 4.

Fig 4

(A) Relative abundance (measured in reads per kilobase of genome per gigabase of metagenome—RPKG) of the flotillin-associated rhodopsin gene clusters in a depth profile metagenomic study from Western Mediterranean Sea collected in a single offshore location during a period of marked thermal stratification (31). Metagenomes are grouped by depth (i) upper photic (UP, 15 and 30 m), (ii) deep chlorophyll maximum (DCM, 45 and 60 m), (iii) lower photic (LP, 75 and 90 m), and deep (Deep, 1,000 and 2,000 m) layers. (B) Metagenome and Metatranscriptome analysis of flotillin-associated rhodopsin gene cluster. The upper panel show read coverage of the metatranscriptomic Illumina reads from the 60 m sample of the profile (98% identity, 250 bp window). In the lower panel, a metagenomic recruitment plot using raw reads from the samples 15 m and 60 m (> 70% identity, > 50 bp long) is represented. FAR gene cluster is highlighted in red. (C) Read coverage showing individual reads in the 60 m metatranscriptome of the FAR gene cluster showing that the FArhodopsin and flotillin genes are co-transcribed (there are reads covering both genes simultaneously).

Metatranscriptomes were available from the same seawater samples (Western Mediterranean Sea, 15 and 60 m) (69). Their recruitment on SAGs confirmed the transcription of the FAR cluster of Pelagibacterales AG-390-L03 at 60 m (Fig. 4B and C) but, as expected, no matching transcripts were found at 15 m. The tandem rhodopsin-flotillin was the second most recruited region in the cluster (Fig. 4B), indicating that these genes are abundantly expressed. The average recruitment value (in reads per kilobase gene per million transcripts—RPKM) of the FAR (0.32 RPKM) was substantially above the average number of transcripts aligning to the genome (0.2 RPKM, with the rRNA ribosomal operon excluded). The transcriptomic read coverage indicates clearly that both FArhodopsin and flotillin (at least) are cotranscribed in a single polycistronic mRNA despite the 16 nucleotides separating both genes (Fig. 4C). Several mRNA reads containing the spacer could be detected (data not shown).

Predicted structure of the retinal binding pocket

The most notable characteristic of microbial rhodopsins is their capability to respond to light through the use of the retinal cofactor. However, when (26) expressed in Escherichia coli a rhodopsin gene containing the DTT motif, they detected no binding of retinal, judging from the pellet pigmentation of centrifuged cells nor change in pH upon illumination. The lack of function in E. coli can be associated with a lack of proper membrane insertion into a lipid raft due to the lack of the flotillin-associated protein that was missing in the E. coli construct. Interestingly, Olson et al. (26) modified a functional DTE-Q sequence to DTT-T, which resulted in the loss of function.

We used AlphaFold (52) to predict the structure of marine FArhodopsin to examine its retinal binding pocket. The predicted structure showed high similarity to that of the closest homolog with an experimentally determined high-resolution structure (PDB: 4jq6, blue proteorhodopsin from Med12, a predicted protein from a metagenomic sample from the Mediterranean Sea at 12 m deep (70)), which allowed us to adapt the retinal position to the predicted structure. However, when analyzing the amino acids that are highly conserved among Pelagibacterales FArhodopsins (present in more than 99% of cases), we observed that four of these residues would intersect the retinal in its all-trans conformation (Fig. 5A). A similar observation was made in other groups of marine FArhodopsins (Fig. 5B and C). Regarding the whole amino acid alignment of the three marine FArhodopsins, we detected that only 98 out of 239 residues were highly conserved (present in more than 99% of cases, Table S5), indicating that, although been grouped within the same cluster, sequences from the three marine taxa are quite distinct. In addition, as mentioned above, most of the freshwater FArhodopsins lack the lysine that is used to bind retinal. These findings suggest that all FArhodopsins may be retinal-less like those previously described in the Halobacteriaceae (11) or betaproteobacteria Rh-noK (12). Their seven transmembrane helices may have evolved to perceive other stimuli, similar to G protein-coupled receptors (G-PCRs), which have adapted for responding to both light and chemical activation or they could be involved in either regulation or membrane microdomain organization as shown for some G-PCRs in the animal retina (71). However, it is important to note that there is no direct evidence that the AlphaFold prediction is precisely correct. Recently, an unwound helix in a structure of a channelrhodopsin called ChRmine was not predicted by AlphaFold (72). It is possible that another significant distortion from the classic rhodopsin fold could still allow for binding of the retinal or that FArhodopsins may bind other carotenoid derivatives to absorb light.

Fig 5.

Fig 5

AlphaFold prediction of retinal binding pockets in marine FArhodopsins from different subclades: Pelagibacterales (A), HIMB59 (B), and Pseudothioglobus (C). Retinal is depicted in aqua green, and only residues that are present in more than 99% of the subclade family members are displayed. Residues that clash with the retinal are indicated and highlighted in red. Crystal structure of blue light-absorbing proteorhodopsin from Med12 (PDB: 4JQ6) is shown in (D) for comparison.

Synthesis

The discovery, two decades ago (13), of rhodopsin sequences in marine prokaryoplankton communities resulted in a significant leap forward in the understanding of the biology of these microorganisms and their interaction with the ecosystem. In the photic zone, the proton-pumping activity of rhodopsins may provide “cheap” solar energy (73). However, rhodopsins can perform a wide variety of other functions (2, 4). Furthermore, the presence of paralogs with very divergent activities and even sequences has been known since the first microbial rhodopsins were described in halophilic archaea, and they are intensively looked after since they reveal novel physio-ecological roles and can be of potential application for optogenetics. In this work, we have analyzed the distribution and the presence of rhodopsin paralogs in a large dataset of marine SAGs (25) that provide strong evidence that the detected genes come from the same genome. We tried with the meager representation of cultured microbes but, unsurprisingly, could not find any examples of genomes with two rhodopsins. MAGs, on the other hand, are a composite of genomic fragments, and the presence of contigs belonging to different organisms (chimerism) may interfere with the analysis.

Unsurprisingly, only a small subset of SAGs encode multiple rhodopsins, most of which belonged to well-known streamlined lineages, such as Pelagibacterales (74 - 77), Ca. Actinomarinales (55, 78), and SAR86 (79 - 81). These microbes are characterized by having a small genome size, a higher coding density, and a reduced set of paralogs, to cope with limited nutrient and energy availability in the oligotrophic open ocean (82). Nevertheless, we detected two copies of the rhodopsin gene in the genomes of three widespread streamlined taxa, Pelagibacterales, HIMB59 and the Gammaproteobacteria Pseudothioglobus. One paralog was a typical proteorhodopsin proton pump, while the other was affiliated with a novel clade described among assembled contigs in ALOHA metagenomes at depths of 200 to 1,000 m (26). The access to the genomic context, which was not possible previously, showed a consistent association with a conspicuous flotillin gene forming a single mRNA transcript. Flotillins are members of a superfamily of proteins termed “SPFH” (Stomatin, Prohibitin, Flotillin, and HfK/C) (57), detected for the first time in the cytoplasmic membrane of eukaryotes (58). These proteins are involved in the formation of microdomains (also known as lipid rafts) in the plasma membrane of all mammalian cells (58). Its biological role is not well understood, but it is believed that they act as scaffold proteins, enhancing the recruitment of raft-associated proteins to these microdomains to be active, as well as facilitating their interaction and oligomerization with other proteins (61). Flotillin sequences have been also detected in bacteria. Specifically in Bacillus subtilis, which encodes for two subunits, YuaG and YqfA (renamed as FloT and FloA) (61). Mutants lacking both genes showed reduced sporulation efficiency and changes in the biofilm formation (59, 60, 83). Remarkably, our results indicated that the flotillin encoded in the FAR cluster was the only copy present in the genome, and therefore, its presence might be essential for the formation of specific microdomains in these organisms. In B. subtilis (and other bacteria), flotillin appears in lipid rafts that concentrate the phospholipid cardiolipin and cyclic (hopanoids) or noncyclic isoprenoids, such as those derived from carotenoids (61). The genomes of Pelagibacterales, HIMB59, and Pseudothioglobus encode for a cardiolipin synthase. For that reason, the presence of lipid rafts, despite not being demonstrated “in vivo,” seems plausible. We would like to hypothesize that the flotillin detected in the genomic island might be critical for the function of the FAR cluster genes. Many FAR genes annotated were classified as membrane transporters or signal transduction/regulators, proteins typically enriched in flotillin-associated microdomains (59, 62).

In the Mediterranean, FArhodopsin genes were abundant in metagenomes coming from the lower photic zone, with a maximum at the deep-chlorophyll-maximum (45, 60 m deep). In contrast, FArhodopsin genes peaked in deeper waters (500 m) at ALOHA station in the Pacific Ocean. This fact, together with the lack of in vitro activity and even retinal binding of some ALOHA representatives cloned in E. coli suggests that these rhodopsins might be retinal-less or even blind rhodopsins (not using light). Furthermore, most freshwater FArhodopsins do not have the lysine residue needed for retinal binding. However, a more in-depth screening of several metagenomic datasets indicates that they tend to associate with the photic (or twilight) zone, but not with bathypelagic waters. In addition, we have always detected a standard retinal-binding proton pumping proteorhodopsin present elsewhere in the genome, which indicates a clear light-interactive lifestyle (at least periodically). A possible explanation would be that flotillins enhance the formation of a heterocomplex of DTE and FArhodopsins in lipid rafts. This kind of rhodopsin heterocomplexes has been described in fungi (3, 84), on which a classic blue or green-absorbing rhodopsin cyclase (a kind of rhodopsin linked to a type III guanylyl cyclase) interacts with another rhodopsin cyclase, but with an absorption in the near-infrared spectrum. Thus, there is a chance that the FArhodopsin interacts with the proton-pump proteorhodopsin, present in the same genome, in a heterocomplex, modifying the spectral tunning to accommodate ion transport to deeper waters.

The presence of FArhodopsins in freshwater metagenomes came as a surprise. They are much more diverse at the sequence level than marine FArhodopsins and are present in a much wider taxonomic array of prokaryotes. The association with flotillin was also detected in all cases in which contigs were long enough suggesting similar functional properties. In one case, we found a FAR cluster syntenic with a marine Pelagibacter genome in a distant relative from Lake Baikal. The high similarity of the FAR cluster between marine and freshwater Pelagibacterales might indicate marine-to-freshwater transitions of the FAR gene cluster (34).

There is always the possibility that FArhodopsins act as sensors of other stimuli (as hypothesized for retinal-less rhodopsins of haloarchaea (11) or Rh-noK (12), but the lack of genes coding for transducers in the FAR cluster and the association with the photic zone in many metagenomes indicates otherwise. We hypothesize that FArhodopsins may act as complements to the bona fide proton pump, by increasing its performance under the dimmer light conditions prevalent in the lower photic zone. The streamlined nature of genomes containing FArhodopsin and the high conservation detected in the FAR gene cluster, even across marine-freshwater transitions, point to an important role in the physiology and ecology of some of the most abundant aquatic microbes. Perhaps extending the “solar energy” strategy to deeper waters where a large part of the water column microbiome activity takes place.

ACKNOWLEDGMENTS

This work was supported by grant “FLEX3GEN” PID2020-118052GB-I00 (cofounded with FEDER funds) from the Spanish Ministerio de Economía, Industria y Competitividad to M.L.-P. and F.R.-V. J.M.H.-M. was supported with a Ph.D. fellowship from Margarita Salas program, cofounded by the Spanish Ministerio de Universidades and the European Union -- Next Generation EU (2021/PER/00020). R.S. was supported by the Simons Foundation grants 510023 and 827839. The work of A.A. was supported by funding of the German Research Foundation to Dr. Tobias Moser via the Multiscale Bioimaging - Cluster of Excellence (EXC 2067/1-390729940).

The comments and suggestions of Oded Beja on the preprint of this paper are gratefully acknowledged.

J.M.H.-M. conceived the study. J.M.H.-M., M.L.-P., A.A., E.P., K.K., and F.R.-V. analyzed the data. J.M.H.-M., M.L.-P., A.A., V.G., R.S., and F.R.-V. contributed to write the manuscript.

The authors declare that they have no competing interests.

Contributor Information

Francisco Rodriguez-Valera, Email: frvalera@umh.es.

Steven J. Hallam, University of British Columbia, Vancouver, Canada

SUPPLEMENTAL MATERIAL

The following material is available online at https://doi.org/10.1128/msystems.00008-23.

FIG S1. msystems.00008-23-s0001.pdf.

Complete protein sequence alignment of six proteorhodopsin and six FArhodopsin sequences. Sequences come from six SAGs (two of each taxa) containing both rhodopsins in their genomes. Dark and light grey regions represent conserved residues within the alignment.

DOI: 10.1128/msystems.00008-23.SuF1
FIG S2. msystems.00008-23-s0002.pdf.

Maximum-likelihood phylogenetic tree of flotillin genes retrieved from marine SAGs. Reference flotillin sequences and their accession numbers within brackets are colored in black. Sequences in red represent sequences from marine SAGs that were not found near a FArhodopsin. Sequences in blue correspond to the flotillins of the three taxa coding for the tandem flotillin-FArhodopsin. Sequences in this branch were condensed for simplicity.

DOI: 10.1128/msystems.00008-23.SuF2
FIG S3. msystems.00008-23-s0003.pdf.

Relative abundance (measured in RPKG) of the FAR gene cluster in Tara Ocean metagenomes. Each province has been divided by depth in surface (SRF), deep chlorophyll maximum (DCM), and mesopelagic (MES) regions. The bars have been colored according to the microbes (blue, Pelagibacterales; purple, HIMB59; and green, Pseudothioglobus).

DOI: 10.1128/msystems.00008-23.SuF3
TABLE S1. msystems.00008-23-s0004.xlsx.

Summary of marine SAGs, with their taxonomic classification, and the number of type-I and type-III rhodopsin genes detected.

DOI: 10.1128/msystems.00008-23.SuF4
TABLE S2. msystems.00008-23-s0005.xlsx.

Functional annotation of genes encoded within the FAR gene clusters of Pelagibacterales, HIMB59, and Pseudothioglobus. Only one FAR gene cluster per taxa is shown.

DOI: 10.1128/msystems.00008-23.SuF5
TABLE S3. msystems.00008-23-s0006.xlsx.

Number of rhodopsin sequences per metagenomic data set.

DOI: 10.1128/msystems.00008-23.SuF6
TABLE S4. msystems.00008-23-s0007.xlsx.

Protein alignment of rhodopsin metagenomic sequences. Amino acids involved in the ion translocation activity are colored in blue. The lysine involved in retinal binding is colored in red.

DOI: 10.1128/msystems.00008-23.SuF7
TABLE S5. msystems.00008-23-s0008.xlsx.

Amino acid profile of marine FArhodopsins. Three hundred protein sequences from SAGs and metagenomes were randomly selected (one hundred per each subgroup -- Pelagibacterales, HIMB59, and Pseudothioglobus) and aligned. Each number in the table represents the weight (in percentage) of a given amino acid (row) in a given position in the alignment (column).

DOI: 10.1128/msystems.00008-23.SuF8

ASM does not own the copyrights to Supplemental Material that may be linked to, or accessed through, an article. The authors have granted ASM a non-exclusive, world-wide license to publish the Supplemental Material files. Please contact the corresponding author directly for reuse.

REFERENCES

  • 1. Béjà O, Lanyi JK. 2014. Nature’s toolkit for microbial rhodopsin ion pumps. Proc Natl Acad Sci U S A 111:6538–6539. doi: 10.1073/pnas.1405093111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Gushchin I, Gordeliy V. 2018. Microbial rhodopsins. Subcell Biochem 87:19–56. [DOI] [PubMed] [Google Scholar]
  • 3. Rozenberg A, Inoue K, Kandori H, Béjà O. 2021. Microbial rhodopsins: the last two decades. Annu Rev Microbiol 75:427–447. doi: 10.1146/annurev-micro-031721-020452 [DOI] [PubMed] [Google Scholar]
  • 4. Pinhassi J, DeLong EF, Béjà O, González JM, Pedrós-Alió C. 2016. Marine bacterial and archaeal ion-pumping rhodopsins: genetic diversity, physiology, and ecology. Microbiol Mol Biol Rev 80:929–954. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Inoue K, Kato Y, Kandori H. 2015. Light-Driven ion-translocating rhodopsins in marine bacteria. Trends Microbiol 23:91–98. doi: 10.1016/j.tim.2014.10.009 [DOI] [PubMed] [Google Scholar]
  • 6. Pushkarev A, Inoue K, Larom S, Flores-Uribe J, Singh M, Konno M, Tomida S, Ito S, Nakamura R, Tsunoda SP, Philosof A, Sharon I, Yutin N, Koonin EV, Kandori H, Béjà O. 2018. A distinct abundant group of microbial rhodopsins discovered using functional metagenomics. Nature 558:595–599. doi: 10.1038/s41586-018-0225-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Kovalev K, Volkov D, Astashkin R, Alekseev A, Gushchin I, Haro-Moreno JM, Chizhov I, Siletsky S, Mamedov M, Rogachev A, Balandin T, Borshchevskiy V, Popov A, Bourenkov G, Bamberg E, Rodriguez-Valera F, Büldt G, Gordeliy V. 2020. High-Resolution structural insights into the heliorhodopsin family. Proc Natl Acad Sci U S A 117:4131–4141. doi: 10.1073/pnas.1915888117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Shim J-G, Cho S-G, Kim S-H, Chuon K, Meas S, Choi A, Jung K-H. 2022. Heliorhodopsin helps photolyase to enhance the DNA repair capacity. Microbiol Spectr 10:e0221522. doi: 10.1128/spectrum.02215-22 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Hososhima S, Mizutori R, Abe-Yoshizumi R, Rozenberg A, Shigemura S, Pushkarev A, Konno M, Katayama K, Inoue K, Tsunoda SP, Béjà O, Kandori H. 2022. Proton-Transporting heliorhodopsins from marine giant viruses. Elife 11:e78416. doi: 10.7554/eLife.78416 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Cho S-G, Song M, Chuon K, Shim J-G, Meas S, Jung K-H. 2022. Heliorhodopsin binds and regulates glutamine synthetase activity. PLoS Biol 20:e3001817. doi: 10.1371/journal.pbio.3001817 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Becker EA, Yao AI, Seitzer PM, Kind T, Wang T, Eigenheer R, Shao KSY, Yarov-Yarovoy V, Facciotti MT. 2016. A large and phylogenetically diverse class of type 1 opsins lacking a canonical retinal binding site. PLoS One 11:e0156543. doi: 10.1371/journal.pone.0156543 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Yamauchi Y, Konno M, Yamada D, Yura K, Inoue K, Béjà O, Kandori H. 2019. Engineered functional recovery of microbial rhodopsin without retinal-binding lysine. Photochem Photobiol 95:1116–1121. doi: 10.1111/php.13114 [DOI] [PubMed] [Google Scholar]
  • 13. Béjà O, Aravind L, Koonin EV, Suzuki MT, Hadd A, Nguyen LP, Jovanovich SB, Gates CM, Feldman RA, Spudich JL, Spudich EN, DeLong EF. 2000. Bacterial rhodopsin: evidence for a new type of phototrophy in the sea. Science 289:1902–1906. doi: 10.1126/science.289.5486.1902 [DOI] [PubMed] [Google Scholar]
  • 14. Béjà O, Spudich EN, Spudich JL, Leclerc M, DeLong EF. 2001. Proteorhodopsin phototrophy in the ocean. Nature 411:786–789. doi: 10.1038/35081051 [DOI] [PubMed] [Google Scholar]
  • 15. Béjà O, Pinhassi J, Spudich JL. 2013. Proteorhodopsins: widespread microbial light-driven proton pumps, p 280–285. In Encycl biodivers [Google Scholar]
  • 16. Gómez-Consarnau L, Raven JA, Levine NM, Cutter LS, Wang D, Seegers B, Arístegui J, Fuhrman JA, Gasol JM, Sañudo-Wilhelmy SA. 2019. Microbial rhodopsins are major contributors to the solar energy captured in the sea. Sci Adv 5:eaaw8855. doi: 10.1126/sciadv.aaw8855 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Oesterhelt D, Stoeckenius W. 1971. Rhodopsin-Like protein from the purple membrane of Halobacterium halobium. Nat New Biol 233:149–152. doi: 10.1038/newbio233149a0 [DOI] [PubMed] [Google Scholar]
  • 18. Spudich JL, Bogomolni RA. 1984. Mechanism of colour discrimination by a bacterial sensory rhodopsin. Nature 312:509–513. doi: 10.1038/312509a0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Fu H-Y, Lin Y-C, Chang Y-N, Tseng H, Huang C-C, Liu K-C, Huang C-S, Su C-W, Weng RR, Lee Y-Y, Ng WV, Yang C-S. 2010. A novel six-rhodopsin system in a single archaeon. J Bacteriol 192:5866–5873. doi: 10.1128/JB.00642-10 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Inoue K, Ono H, Abe-Yoshizumi R, Yoshizawa S, Ito H, Kogure K, Kandori H. 2013. A light-driven sodium ion pump in marine bacteria. Nat Commun 4:1678. doi: 10.1038/ncomms2689 [DOI] [PubMed] [Google Scholar]
  • 21. Riedel T, Held B, Nolan M, Lucas S, Lapidus A, Tice H, Del Rio TG, Cheng JF, Han C, Tapia R, Goodwin LA, Pitluck S, Liolios K, Mavromatis K, Pagani I, Ivanova N, Mikhailova N, Pati A, Chen A, Palaniappan K, Land M, Rohde M, Tindall BJ, Detter JC, Göker M, Bristow J, Eisen JA, Markowitz V, Hugenholtz P, Kyrpides NC, Klenk HP, Woyke T. 2012. Genome sequence of the Antarctic rhodopsins-containing Flavobacterium Gillisia limnaea type strain (R-8282 (T)). Stand Genomic Sci 7:107–119. doi: 10.4056/sigs.3216895 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Yoshizawa S, Kumagai Y, Kim H, Ogura Y, Hayashi T, Iwasaki W, DeLong EF, Kogure K. 2014. Functional characterization of flavobacteria rhodopsins reveals a unique class of light-driven chloride pump in bacteria. Proc Natl Acad Sci U S A 111:6732–6737. doi: 10.1073/pnas.1403051111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Govorunova EG, Sineshchekov OA, Li H, Spudich JL. 2017. Microbial rhodopsins: diversity, mechanisms, and optogenetic applications. Annu Rev Biochem 86:845–872. doi: 10.1146/annurev-biochem-101910-144233 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Shevchenko V, Mager T, Kovalev K, Polovinkin V, Alekseev A, Juettner J, Chizhov I, Bamann C, Vavourakis C, Ghai R, Gushchin I, Borshchevskiy V, Rogachev A, Melnikov I, Popov A, Balandin T, Rodriguez-Valera F, Manstein DJ, Bueldt G, Bamberg E, Gordeliy V. 2017. Inward H+ pump xenorhodopsin: mechanism and alternative optogenetic approach. Sci Adv 3:e1603187. doi: 10.1126/sciadv.1603187 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Pachiadaki MG, Brown JM, Brown J, Bezuidt O, Berube PM, Biller SJ, Poulton NJ, Burkart MD, La Clair JJ, Chisholm SW, Stepanauskas R. 2019. Charting the complexity of the marine microbiome through single-cell genomics. Cell 179:1623–1635. doi: 10.1016/j.cell.2019.11.017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Olson DK, Yoshizawa S, Boeuf D, Iwasaki W, DeLong EF. 2018. Proteorhodopsin variability and distribution in the North Pacific subtropical Gyre. ISME J 12:1047–1060. doi: 10.1038/s41396-018-0074-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25:1043–1055. doi: 10.1101/gr.186072.114 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. 2019. GTDB-tk: a toolkit to classify genomes with the genome taxonomy database. Bioinformatics 36:1925–1927. doi: 10.1093/bioinformatics/btz848 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Parks DH, Chuvochina M, Chaumeil P-A, Rinke C, Mussig AJ, Hugenholtz P. 2020. Author correction: a complete domain-to-species taxonomy for bacteria and archaea. Nat Biotechnol 38:1098. doi: 10.1038/s41587-020-0539-7 [DOI] [PubMed] [Google Scholar]
  • 30. López-Pérez M, Haro-Moreno JM, Gonzalez-Serrano R, Parras-Moltó M, Rodriguez-Valera F. 2017. Genome diversity of marine phages recovered from Mediterranean metagenomes: size matters. PLoS Genet 13:e1007018. doi: 10.1371/journal.pgen.1007018 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Haro-Moreno JM, López-Pérez M, de la Torre JR, Picazo A, Camacho A, Rodriguez-Valera F. 2018. Fine Metagenomic profile of the Mediterranean stratified and mixed water columns revealed by assembly and recruitment. Microbiome 6:128. doi: 10.1186/s40168-018-0513-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Biller SJ, Berube PM, Dooley K, Williams M, Satinsky BM, Hackl T, Hogle SL, Coe A, Bergauer K, Bouman HA, Browning TJ, De Corte D, Hassler C, Hulston D, Jacquot JE, Maas EW, Reinthaler T, Sintes E, Yokokawa T, Chisholm SW. 2018. Marine microbial metagenomes sampled across space and time. Sci Data 5:180176. doi: 10.1038/sdata.2018.176 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Sunagawa S, Coelho LP, Chaffron S, Kultima JR, Labadie K, Salazar G, Djahanschiri B, Zeller G, Mende DR, Alberti A, Cornejo-Castillo FM, Costea PI, Cruaud C, d’Ovidio F, Engelen S, Ferrera I, Gasol JM, Guidi L, Hildebrand F, Kokoszka F, Lepoivre C, Lima-Mendez G, Poulain J, Poulos BT, Royo-Llonch M, Sarmento H, Vieira-Silva S, Dimier C, Picheral M, Searson S, Kandels-Lewis S, Tara Oceans coordinators, Bowler C, de Vargas C, Gorsky G, Grimsley N, Hingamp P, Iudicone D, Jaillon O, Not F, Ogata H, Pesant S, Speich S, Stemmann L, Sullivan MB, Weissenbach J, Wincker P, Karsenti E, Raes J, Acinas SG, Bork P. 2015. Ocean plankton. structure and function of the global ocean microbiome. Science 348:1261359. doi: 10.1126/science.1261359 [DOI] [PubMed] [Google Scholar]
  • 34. Cabello-Yeves PJ, Zemskaya TI, Rosselli R, Coutinho FH, Zakharenko AS, Blinov VV, Rodriguez-Valera F. 2018. Genomes of novel microbial lineages assembled from the sub-ice waters of Lake Baikal. Appl Environ Microbiol 84:e02132-17. doi: 10.1128/AEM.02132-17 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Cabello‐Yeves PJ, Zemskaya TI, Zakharenko AS, Sakirko MV, Ivanov VG, Ghai R, Rodriguez‐Valera F. 2020. Microbiome of the deep lake Baikal, a unique oxic bathypelagic habitat. Limnol Oceanogr 65:1471–1488. doi: 10.1002/lno.11401 [DOI] [Google Scholar]
  • 36. Cabello-Yeves PJ, Haro-Moreno JM, Martin-Cuadrado A-B, Ghai R, Picazo A, Camacho A, Rodriguez-Valera F. 2017. Novel Synechococcus Genomes reconstructed from freshwater reservoirs. Front Microbiol 8:1151. doi: 10.3389/fmicb.2017.01151 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Chiriac M-C, Bulzu P-A, Andrei A-S, Okazaki Y, Nakano S-I, Haber M, Kavagutti VS, Layoun P, Ghai R, Salcher MM. 2022. Ecogenomics sheds light on diverse lifestyle strategies in freshwater CPR. Microbiome 10:84. doi: 10.1186/s40168-022-01274-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Finn RD, Clements J, Eddy SR. 2011. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res 39:W29–37. doi: 10.1093/nar/gkr367 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Edgar RC. 2004. Muscle: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797. doi: 10.1093/nar/gkh340 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. 2009. TrimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972–1973. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32:268–274. doi: 10.1093/molbev/msu300 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped blast and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402. doi: 10.1093/nar/25.17.3389 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119. doi: 10.1186/1471-2105-11-119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Buchfink B, Xie C, Huson DH. 2015. Fast and sensitive protein alignment using diamond. Nat Methods 12:59–60. doi: 10.1038/nmeth.3176 [DOI] [PubMed] [Google Scholar]
  • 45. Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV. 2001. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res 29:22–28. doi: 10.1093/nar/29.1.22 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Haft DH, Loftus BJ, Richardson DL, Yang F, Eisen JA, Paulsen IT, White O. 2001. TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res 29:41–43. doi: 10.1093/nar/29.1.41 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJA, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. 2009. InterPro: the integrative protein signature database. Nucleic Acids Res 37:D211–5. doi: 10.1093/nar/gkn785 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Hallgren J, Tsirigos KD, Pedersen MD, Almagro Armenteros JJ, Marcatili P, Nielsen H, Krogh A, Winther O. 2022. DeepTMHMM predicts alpha and beta transmembrane proteins using deep neural networks. Bioinformatics. doi: 10.1101/2022.04.08.487609 [DOI]
  • 49. Teufel F, Almagro Armenteros JJ, Johansen AR, Gíslason MH, Pihl SI, Tsirigos KD, Winther O, Brunak S, von Heijne G, Nielsen H. 2022. SignalP 6.0 predicts all five types of signal peptides using protein language models. Nat Biotechnol 40:1023–1025. doi: 10.1038/s41587-021-01156-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Mistry J, Chuguransky S, Williams L, Qureshi M, Salazar GA, Sonnhammer ELL, Tosatto SCE, Paladin L, Raj S, Richardson LJ, Finn RD, Bateman A. 2021. Pfam: the protein families database in 2021. Nucleic Acids Res 49:D412–D419. doi: 10.1093/nar/gkaa913 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D. 2021. Highly accurate protein structure prediction with alphafold. Nature 596:583–589. doi: 10.1038/s41586-021-03819-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. 2022. ColabFold: making protein folding accessible to all. Nat Methods 19:679–682. doi: 10.1038/s41592-022-01488-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Bowers RM, Kyrpides NC, Stepanauskas R, Harmon-Smith M, Doud D, Reddy TBK, Schulz F, Jarett J, Rivers AR, Eloe-Fadrosh EA, Tringe SG, Ivanova NN, Copeland A, Clum A, Becraft ED, Malmstrom RR, Birren B, Podar M, Bork P, Weinstock GM, Garrity GM, Dodsworth JA, Yooseph S, Sutton G, Glöckner FO, Gilbert JA, Nelson WC, Hallam SJ, Jungbluth SP, Ettema TJG, Tighe S, Konstantinidis KT, Liu W-T, Baker BJ, Rattei T, Eisen JA, Hedlund B, McMahon KD, Fierer N, Knight R, Finn R, Cochrane G, Karsch-Mizrachi I, Tyson GW, Rinke C, Genome Standards Consortium, Lapidus A, Meyer F, Yilmaz P, Parks DH, Eren AM, Schriml L, Banfield JF, Hugenholtz P, Woyke T. 2017. Minimum information about a single amplified genome (misag) and a metagenome-assembled genome (mimag) of bacteria and archaea. Nat Biotechnol 35:725–731. doi: 10.1038/nbt.3893 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. López-Pérez M, Haro-Moreno JM, Iranzo J, Rodriguez-Valera F. 2020. Genomes of the "Candidatus Actinomarinales" order: Highly streamlined Marine Epipelagic Actinobacteria. mSystems 5:e01041-20. doi: 10.1128/mSystems.01041-20 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Haro-Moreno JM, Rodriguez-Valera F, Rosselli R, Martinez-Hernandez F, Roda-Garcia JJ, Gomez ML, Fornas O, Martinez-Garcia M, López-Pérez M. 2020. Ecogenomics of the SAR11 clade. Environ Microbiol 22:1748–1763. doi: 10.1111/1462-2920.14896 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Browman DT, Hoegg MB, Robbins SM. 2007. The SPFH domain-containing proteins: more than lipid raft markers. Trends Cell Biol 17:394–402. doi: 10.1016/j.tcb.2007.06.005 [DOI] [PubMed] [Google Scholar]
  • 58. Otto GP, Nichols BJ. 2011. The roles of flotillin microdomains -- endocytosis and beyond. J Cell Sci 124:3933–3940. doi: 10.1242/jcs.092015 [DOI] [PubMed] [Google Scholar]
  • 59. Donovan C, Bramkamp M. 2009. Characterization and subcellular localization of a bacterial flotillin homologue. Microbiology (Reading) 155:1786–1799. doi: 10.1099/mic.0.025312-0 [DOI] [PubMed] [Google Scholar]
  • 60. López D, Kolter R. 2010. Functional microdomains in bacterial membranes. Genes Dev 24:1893–1902. doi: 10.1101/gad.1945010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Bramkamp M, Lopez D. 2015. Exploring the existence of lipid rafts in bacteria. Microbiol Mol Biol Rev 79:81–100. doi: 10.1128/MMBR.00036-14 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Bach JN, Bramkamp M. 2013. Flotillins functionally organize the bacterial membrane. Mol Microbiol 88:1205–1217. doi: 10.1111/mmi.12252 [DOI] [PubMed] [Google Scholar]
  • 63. Bandyopadhyay K, Parua PK, Datta AB, Parrack P. 2010. Escherichia coli Hflk and Hflc can individually inhibit the HflB (FtsH) -mediated proteolysis of lambdaCII in vitro. Arch Biochem Biophys 501:239–243. doi: 10.1016/j.abb.2010.06.030 [DOI] [PubMed] [Google Scholar]
  • 64. Schär HP, Holzmann W, Ramos Tombo GM, Ghisalba O. 1986. Purification and characterization of N, N-dimethylformamidase from Pseudomonas DMF 3/3. Eur J Biochem 158:469–475. doi: 10.1111/j.1432-1033.1986.tb09778.x [DOI] [PubMed] [Google Scholar]
  • 65. Oke M, Oni O, Bello R, Samuel-Omoyajowo K, Senbadejo T. 2019. Structure-Function relationships of the 5-oxoprolinase subunit A: guiding biological sciences students down the path less traveled. Biochem Mol Biol Educ 47:620–631. doi: 10.1002/bmb.21300 [DOI] [PubMed] [Google Scholar]
  • 66. Haro-Moreno JM, López-Pérez M, Rodriguez-Valera F. 2021. Enhanced recovery of microbial genes and genomes from a marine water column using long-read metagenomics. Front Microbiol 12:708782. doi: 10.3389/fmicb.2021.708782 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Garner RE, Kraemer SA, Onana VE, Huot Y, Walsh DA. 2022. Genome-resolved community structure and function of freshwater bacteria at a Continental scale. BioRxiv. doi: 10.1101/2022.08.12.503676 [DOI]
  • 68. Cabello-Yeves PJ, Callieri C, Picazo A, Mehrshad M, Haro-Moreno JM, Roda-Garcia JJ, Dzhembekova N, Slabakova V, Slabakova N, Moncheva S, Rodriguez-Valera F. 2021. The microbiome of the black sea water column analyzed by shotgun and genome centric metagenomics. Environ Microbiome 16:5. doi: 10.1186/s40793-021-00374-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. López-Pérez M, Haro-Moreno JM, de la Torre JR, Rodriguez-Valera F. 2019. Novel caudovirales associated with marine group I Thaumarchaeota assembled from metagenomes. Environ Microbiol 21:1980–1988. doi: 10.1111/1462-2920.14462 [DOI] [PubMed] [Google Scholar]
  • 70. Ran T, Ozorowski G, Gao Y, Sineshchekov OA, Wang W, Spudich JL, Luecke H. 2013. Cross-protomer interaction with the photoactive site in oligomeric proteorhodopsin complexes. Acta Crystallogr D Biol Crystallogr 69:1965–1980. doi: 10.1107/S0907444913017575 [DOI] [PubMed] [Google Scholar]
  • 71. Goren MA, Morizumi T, Menon I, Joseph JS, Dittman JS, Cherezov V, Stevens RC, Ernst OP, Menon AK. 2014. Constitutive phospholipid scramblase activity of a G protein-coupled receptor. Nat Commun 5:5115. doi: 10.1038/ncomms6115 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Kishi KE, Kim YS, Fukuda M, Inoue M, Kusakizako T, Wang PY, Ramakrishnan C, Byrne EFX, Thadhani E, Paggi JM, Matsui TE, Yamashita K, Nagata T, Konno M, Quirin S, Lo M, Benster T, Uemura T, Liu K, Shibata M, Nomura N, Iwata S, Nureki O, Dror RO, Inoue K, Deisseroth K, Kato HE. 2022. Structural basis for channel conduction in the pump-like channelrhodopsin chrmine. Cell 185:672–689. doi: 10.1016/j.cell.2022.01.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Munson-McGee JH, Lindsay MR, Sintes E, Brown JM, D’Angelo T, Brown J, Lubelczyk LC, Tomko P, Emerson D, Orcutt BN, Poulton NJ, Herndl GJ, Stepanauskas R. 2022. Decoupling of respiration rates and abundance in marine prokaryoplankton. Nature 612:764–770. doi: 10.1038/s41586-022-05505-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74. Giovannoni SJ, Britschgi TB, Moyer CL, Field KG. 1990. Genetic diversity in Sargasso sea bacterioplankton. Nature 345:60–63. doi: 10.1038/345060a0 [DOI] [PubMed] [Google Scholar]
  • 75. Giovannoni SJ, Tripp HJ, Givan S, Podar M, Vergin KL, Baptista D, Bibbs L, Eads J, Richardson TH, Noordewier M, Rappé MS, Short JM, Carrington JC, Mathur EJ. 2005. Genome streamlining in a cosmopolitan oceanic bacterium. Science 309:1242–1245. doi: 10.1126/science.1114057 [DOI] [PubMed] [Google Scholar]
  • 76. Giovannoni SJ. 2017. SAR11 bacteria: the most abundant plankton in the oceans. Ann Rev Mar Sci 9:231–255. doi: 10.1146/annurev-marine-010814-015934 [DOI] [PubMed] [Google Scholar]
  • 77. López-Pérez M, Haro-Moreno JM, Coutinho FH, Martinez-Garcia M, Rodriguez-Valera F. 2020. The evolutionary success of the marine bacterium sar11 analyzed through a metagenomic perspective. mSystems 5:e00605-20. doi: 10.1128/mSystems.00605-20 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Ghai R, Mizuno CM, Picazo A, Camacho A, Rodriguez-Valera F. 2013. Metagenomics uncovers a new group of low GC and ultra-small marine actinobacteria. Sci Rep 3:2471. doi: 10.1038/srep02471 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. Dupont CL, Rusch DB, Yooseph S, Lombardo M-J, Richter RA, Valas R, Novotny M, Yee-Greenbaum J, Selengut JD, Haft DH, Halpern AL, Lasken RS, Nealson K, Friedman R, Venter JC. 2012. Genomic insights to sar86, an abundant and uncultivated marine bacterial lineage. ISME J 6:1186–1199. doi: 10.1038/ismej.2011.189 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80. Hoarfrost A, Nayfach S, Ladau J, Yooseph S, Arnosti C, Dupont CL, Pollard KS. 2020. Global ecotypes in the ubiquitous marine clade sar86. ISME J 14:178–188. doi: 10.1038/s41396-019-0516-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81. Roda-Garcia JJ, Haro-Moreno JM, Rodriguez-Valera F, Almagro-Moreno S, López-Pérez M. 2023. Single-amplified genomes reveal most streamlined free-living marine bacteria. Environ Microbiol. doi: 10.1111/1462-2920.16348 [DOI] [PubMed] [Google Scholar]
  • 82. Giovannoni SJ, Cameron Thrash J, Temperton B. 2014. Implications of streamlining theory for microbial ecology. ISME J 8:1553–1565. doi: 10.1038/ismej.2014.60 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83. Dempwolff F, Möller HM, Graumann PL. 2012. Synthetic motility and cell shape defects associated with deletions of flotillin/reggie paralogs in Bacillus subtilis and interplay of these proteins with nfed proteins. J Bacteriol 194:4652–4661. doi: 10.1128/JB.00910-12 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84. Broser M, Spreen A, Konold PE, Peter E, Adam S, Borin V, Schapiro I, Seifert R, Kennis JTM, Bernal Sierra YA, Hegemann P. 2020. NeoR, a near-infrared absorbing rhodopsin. Nat Commun 11:5682. doi: 10.1038/s41467-020-19375-8 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

FIG S1. msystems.00008-23-s0001.pdf.

Complete protein sequence alignment of six proteorhodopsin and six FArhodopsin sequences. Sequences come from six SAGs (two of each taxa) containing both rhodopsins in their genomes. Dark and light grey regions represent conserved residues within the alignment.

DOI: 10.1128/msystems.00008-23.SuF1
FIG S2. msystems.00008-23-s0002.pdf.

Maximum-likelihood phylogenetic tree of flotillin genes retrieved from marine SAGs. Reference flotillin sequences and their accession numbers within brackets are colored in black. Sequences in red represent sequences from marine SAGs that were not found near a FArhodopsin. Sequences in blue correspond to the flotillins of the three taxa coding for the tandem flotillin-FArhodopsin. Sequences in this branch were condensed for simplicity.

DOI: 10.1128/msystems.00008-23.SuF2
FIG S3. msystems.00008-23-s0003.pdf.

Relative abundance (measured in RPKG) of the FAR gene cluster in Tara Ocean metagenomes. Each province has been divided by depth in surface (SRF), deep chlorophyll maximum (DCM), and mesopelagic (MES) regions. The bars have been colored according to the microbes (blue, Pelagibacterales; purple, HIMB59; and green, Pseudothioglobus).

DOI: 10.1128/msystems.00008-23.SuF3
TABLE S1. msystems.00008-23-s0004.xlsx.

Summary of marine SAGs, with their taxonomic classification, and the number of type-I and type-III rhodopsin genes detected.

DOI: 10.1128/msystems.00008-23.SuF4
TABLE S2. msystems.00008-23-s0005.xlsx.

Functional annotation of genes encoded within the FAR gene clusters of Pelagibacterales, HIMB59, and Pseudothioglobus. Only one FAR gene cluster per taxa is shown.

DOI: 10.1128/msystems.00008-23.SuF5
TABLE S3. msystems.00008-23-s0006.xlsx.

Number of rhodopsin sequences per metagenomic data set.

DOI: 10.1128/msystems.00008-23.SuF6
TABLE S4. msystems.00008-23-s0007.xlsx.

Protein alignment of rhodopsin metagenomic sequences. Amino acids involved in the ion translocation activity are colored in blue. The lysine involved in retinal binding is colored in red.

DOI: 10.1128/msystems.00008-23.SuF7
TABLE S5. msystems.00008-23-s0008.xlsx.

Amino acid profile of marine FArhodopsins. Three hundred protein sequences from SAGs and metagenomes were randomly selected (one hundred per each subgroup -- Pelagibacterales, HIMB59, and Pseudothioglobus) and aligned. Each number in the table represents the weight (in percentage) of a given amino acid (row) in a given position in the alignment (column).

DOI: 10.1128/msystems.00008-23.SuF8

Articles from mSystems are provided here courtesy of American Society for Microbiology (ASM)

RESOURCES