Proceedings of the National Academy of Sciences of the United States of America. 2017 Jan 3;114(3):E347–E356. doi: 10.1073/pnas.1616234114

Insights into the lifestyle of uncultured bacterial natural product factories associated with marine sponges

Gerald Lackner a,b, Eike Edzard Peters a, Eric J N Helfrich a, Jörn Piel a,1
PMCID: PMC5255618  PMID: 28049838

Significance

The candidate genus “Candidatus Entotheonella” belongs to a recently proposed bacterial candidate phylum with largely unknown properties due to the lack of cultivated members. Among the few known biological properties is an association of Ca. Entotheonella with marine sponges and an extraordinarily rich genomic potential for bioactive natural products with unique structures and unprecedented biosynthetic enzymology. Increasing evidence suggests that Ca. Entotheonella are widespread key producers of sponge natural products with a chemical richness comparable to soil actinomycetes. Given the unusual biology and exceptional pharmacological potential of Ca. Entotheonella, the bioinformatic and functional insights into their lifestyle presented here provide diverse avenues for marine natural product research, biotechnology, and microbial ecology.

Keywords: uncultivated bacteria, natural products, genomics, proteomics, symbiosis

Abstract

The as-yet uncultured filamentous bacteria “Candidatus Entotheonella factor” and “Candidatus Entotheonella gemina” live associated with the marine sponge Theonella swinhoei Y, the source of numerous unusual bioactive natural products. Belonging to the proposed candidate phylum “Tectomicrobia,” Candidatus Entotheonella members are only distantly related to any cultivated organism. Ca. E. factor has been identified as the source of almost all polyketide and modified peptide families reported from the sponge host, and both Ca. Entotheonella phylotypes contain numerous additional genes for as-yet unknown metabolites. Here, we provide insights into the biology of these remarkable bacteria using genomic, (meta)proteomic, and chemical methods. The data suggest a metabolic model of Ca. Entotheonella as facultative anaerobic, organotrophic organisms with the ability to use methanol as an energy source. The symbionts appear to be auxotrophic for some vitamins, but have the potential to produce most amino acids as well as rare cofactors like coenzyme F420. The latter likely accounts for the strong autofluorescence of Ca. Entotheonella filaments. A large expansion of protein families involved in regulation and conversion of organic molecules indicates roles in host–bacterial interaction. In addition, a massive overrepresentation of members of the luciferase-like monooxygenase superfamily points toward an important role of these proteins in Ca. Entotheonella. Furthermore, we performed mass spectrometric imaging combined with fluorescence in situ hybridization to localize Ca. Entotheonella and some of the bioactive natural products in the sponge tissue. These metabolic insights into a new candidate phylum offer hints on the targeted cultivation of the chemically most prolific microorganisms known from microbial dark matter.


Marine sponges are prolific sources of bioactive natural products and of great interest for drug development (1). Besides their pharmacological potential, sponges are among the oldest metazoans and have attracted attention as ancient models of animal–bacterial symbioses. Many sponges harbor highly abundant bacterial communities that exhibit a similar biological complexity as the human microbiome (2, 3), but the ecological roles of these mostly uncultivated microbes remain largely elusive. One of the functions, for which evidence is accumulating, is the production of toxic natural products that might contribute to host defense (4, 5). Significant efforts have been made to connect the chemistry of sponges to possible bacterial producers, largely motivated by the prospect of developing sustainable production systems for drug development. One of the important sponge models that has emerged in these studies is Theonella swinhoei, a chemically exceptionally rich complex of distinct chemotypes. Pioneering work on a variant from Palau revealed the presence of filamentous, multicellular bacteria that could be mechanically enriched and contained elevated amounts of theopalauamide-type antifungal peptides (6). The symbiont was assigned to a new candidate genus and named “Candidatus Entotheonella palauensis” (7). Related Candidatus Entotheonella bacteria were also detected in the sponge Discodermia dissoluta (8), the source of the anticancer drug candidate discodermolide (9, 10), but their chemical role remains unclear (11). Despite repeated efforts (12, 13), the vast majority of T. swinhoei symbionts, including Ca. Entotheonella, remain uncultured to date, except for a report on the detection of Ca. Entotheonella in a mixed culture (7). There is some prospect, however, of overcoming this challenge by genomics-based targeted cultivation (14).

By metagenomic, single-particle genomic, and functional studies, we and collaborators recently provided evidence that almost all known bioactive polyketides and modified peptides previously reported from a Japanese chemotype of T. swinhoei (termed “chemotype Y”) are produced by a single member of the complex microbiome named “Candidatus Entotheonella factor” TSY1 (15–18). In this sponge, the bacterium co-occurs with a second Ca. Entotheonella symbiont (15) that, based on average nucleotide identity values, is a distinct candidate species and was termed “Candidatus Entotheonella gemina” TSY2 (21). Disentangling the metagenomic sequence data by binning analysis revealed a striking number of natural product gene clusters in both phylotypes, but assigned all clusters for attributable T. swinhoei Y compounds (onnamides, polytheonamides, keramamides, pseudotheonamides, cyclotheonamides, and nazumamides) to Ca. E. factor (15). In addition, multiple clusters for as-yet cryptic natural products were identified in both phylotypes, suggesting an even higher biosynthetic capacity. Both phylotypes remain as-yet uncultivated; are only distantly related to any cultivated bacterium; and, on the basis of phylogenomic data, belong to a novel candidate phylum that was termed “Tectomicrobia” (15).

Recently, we presented evidence that another Ca. Entotheonella phylotype, “Candidatus Entotheonella serta,” present in the chemically distinct Japanese sponge T. swinhoei WA, is the producer of the actin-binding polyketide misakinolide (19). Additional Ca. Entotheonella variants were also detected by PCR in a wide range of other sponge species (15) and were connected to natural product biosynthesis in the sponge Discodermia calyx (20, 21). These data and the previous Ca. Entotheonella studies mentioned above (6–8) suggest a more general relevance for the production of sponge-derived compounds. Biosynthetic pathways from Ca. Entotheonella exhibit a high frequency of unusual enzymatic features, and their chemistry shows little similarity to the chemistry of cultivated bacteria (17, 22–24). Being the first example of a chemically prolific taxon among uncultivated bacteria, these producers offer exciting opportunities for pharmaceutical applications, for systematic studies on the ecology of sponge-associated bacteria, and for investigating functional properties of elusive candidate phyla.

Since our first release of the Ca. Entotheonella genomes from T. swinhoei Y (15), we have made several attempts to close gaps in the metagenome by rounds of PacBio and Illumina sequencing. Unfortunately, these attempts did not significantly reduce the number of contigs. Generally, the dataset posed several challenges to genome analysis. First, many of the sequencing gaps disrupted ORFs, which regularly prevented the identification of gene models by automatic annotation pipelines. Furthermore, due to the remote phylogenetic placement of Tectomicrobia (the candidate phylum harboring Ca. Entotheonella), functional predictions were generally more challenging than for strains with a closer relationship to model organisms. Therefore, we performed an extensive reannotation of the genomes and used metaproteomic, metabolic, and imaging methods to provide insights into the intricacies of host–symbiont interactions, metabolic capabilities, nutritional requirements, and genome expansions of Ca. Entotheonella symbionts of T. swinhoei Y. The focus of this study rests outside of classical secondary metabolism. As an ultimate goal, this study is intended to guide the development of cultivation strategies for these high-potential natural product producers.

Results and Discussion

Reannotation of Ca. Entotheonella Genomes and Shotgun Metaproteomics.

Because of the above-mentioned difficulties connected with Ca. Entotheonella genomes, we performed extensive automated and manual reanalyses of genes. All reannotated pathways are listed in Dataset S1. To gain insights into the actual activity of Ca. Entotheonella genes at the time of sponge collection, we aimed to investigate their expression status. Ideally, expression data should be generated from the same organisms that were used for the sequencing project. These specimens had been stored in ethanol at 4 °C for 4 y, and the amount of sample was severely limited. Facing these challenges, we used a peptide fingerprinting approach based on nanoflow liquid chromatography (nano-LC) coupled with electrospray ionization tandem mass spectrometry (MS/MS). For sample preparation, we ground 1 cm³ of sponge sample in artificial seawater, filtered off sponge debris, and enriched for the filamentous Ca. Entotheonella bacteria by density gradient centrifugation, yielding a barely visible bacterial pellet. Microscopic inspection of this preparation revealed the abundant presence of large filaments with the typical Ca. Entotheonella morphology, but almost no detectable unicellular contamination (Fig. S1). The pellet was heat-lysed, fractionated by SDS/PAGE, and subjected to nano-LC–MS/MS. Using a Mascot search against the whole National Center for Biotechnology Information (NCBI) database, we identified 557 proteins with over 95% probability. In a more sensitive search against the two Ca. Entotheonella genomes using MaxQuant (25), we detected 1,661 proteins (1,443 from Ca. E. factor and 886 from Ca. E. gemina, with 668 hits matching both genomes; Dataset S2). With 8,338 and 8,989 unique annotated proteins encoded in the available Ca. E. factor and Ca. E. gemina genomic data, respectively, these values translate into a proteome coverage of 17.3% and 9.9% of all predicted proteins (at a false discovery rate of 1%). Considering the challenging sample and large genome size of ca. 9 Mb, this proteome coverage is respectable. The lower number of identifications for Ca. E. gemina is most likely due to the lower abundance of Ca. E. gemina in the sample as determined by genome sequencing (15).
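The arithmetic behind these figures can be retraced in a few lines. The following is a minimal sketch that uses only the counts reported above; the function and variable names are illustrative and do not come from the study's analysis pipeline.

```python
# Counts reported in the text (MaxQuant search, 1% false discovery rate).
identified = {"Ca. E. factor": 1443, "Ca. E. gemina": 886}
shared_hits = 668  # hits matching both genomes
predicted = {"Ca. E. factor": 8338, "Ca. E. gemina": 8989}


def total_identified(counts, shared):
    """Inclusion-exclusion: proteins matched to either genome, counted once."""
    return sum(counts.values()) - shared


def coverage_percent(found, total):
    """Share of predicted proteins that were detected, in percent."""
    return round(100.0 * found / total, 1)


total = total_identified(identified, shared_hits)
coverage = {
    name: coverage_percent(identified[name], predicted[name])
    for name in identified
}

print(total)     # proteins detected across both genomes
print(coverage)  # per-genome proteome coverage in percent
```

Counting the shared hits only once gives the combined total of 1,661 proteins, and dividing each per-genome count by the number of predicted proteins gives the stated coverage values.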

Fig. S1.


Enriched Ca. Entotheonella cell pellet. Representative phase contrast image of the cell fraction enriched at the 60%:100% (vol/vol) Percoll interface used for the proteomics experiment. (Scale bar: 20 μm.)

Central Carbon and Energy Metabolism.

One important aspect of genomic research on uncultured bacteria is the deduction of metabolic pathways that might deliver valuable information for targeted cultivation experiments. Important clues for successful cultivation are, for instance, potential energy sources, carbon sources, and tolerance to oxygen. Concerning oxygen, the DNA-based data suggest that Ca. E. factor and Ca. E. gemina perform aerobic metabolism and are likely facultative anaerobes. Both strains harbor genes encoding a respiratory chain, including cytochrome c oxidase (expressed) as well as catalase (expressed) and superoxide dismutase (Dataset S1). In addition, anaerobic growth might be supported, as suggested by putative fermentation pathways from pyruvate to D-lactate, to acetate, and to (R,R)-butanediol (not detected by proteomics). Further support for facultative anaerobic growth is provided, for instance, by the presence of the oxygen-independent biosynthesis pathways for cobalamin and deoxynucleotides and by the high number of genes encoding CoA-transferase family III enzymes (discussed below).

Concerning central carbon metabolism (Fig. 1), we found pathways for the utilization of glycerol, acetate, and L-lactate (the latter two with proteomics support). Accordingly, we also identified putative genes for glycerol and glycerol-3-phosphate transporters (both expressed), and we readily identified gene candidates for a complete TCA cycle and the gluconeogenesis pathway. Additional genes encode a group of ATP-binding cassette (ABC) transporters that very likely import monosaccharides according to the classification of the ABC transporter database AbcDB (26) and that are expressed based on our proteomics data. Genes for the breakdown of polysaccharides (e.g., for glycogen-debranching enzyme, for glycogen phosphorylase) are also present in both genomes, but their corresponding proteins could not be detected by proteomics. Even though these findings strongly suggest that sugars can be consumed by Ca. Entotheonella, we were not able to identify any complete pathway for the catabolism of monosaccharides. The glycolysis (Embden–Meyerhof–Parnas) pathway is not complete in the Ca. E. factor dataset, because a 6-phosphofructokinase (pfkA or pfkB) gene candidate is missing. There is, however, a partially sequenced gene in Ca. E. gemina (GenBank accession no. ETX08415) that encodes a member of the PfkB family. Thus, this locus might have been missed in Ca. E. factor due to assembly errors or binning problems.
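The completeness check described here amounts to a set comparison between the enzymes a pathway requires and the enzymes with gene candidates in the annotation. The sketch below illustrates the idea for the Embden–Meyerhof–Parnas pathway. The step names are the textbook enzymes of glycolysis, and the annotated set for Ca. E. factor is an assumption derived solely from the statement that 6-phosphofructokinase is the missing candidate; it is not read from Dataset S1.

```python
# Textbook enzymatic steps of the Embden-Meyerhof-Parnas pathway, in order.
EMP_STEPS = [
    "glucokinase/hexokinase",
    "glucose-6-phosphate isomerase",
    "6-phosphofructokinase",
    "fructose-bisphosphate aldolase",
    "triosephosphate isomerase",
    "glyceraldehyde-3-phosphate dehydrogenase",
    "phosphoglycerate kinase",
    "phosphoglycerate mutase",
    "enolase",
    "pyruvate kinase",
]


def missing_steps(required, annotated):
    """Return the required steps that lack a gene candidate, in pathway order."""
    return [step for step in required if step not in annotated]


def is_complete(required, annotated):
    """A pathway counts as complete only if no required step is missing."""
    return not missing_steps(required, annotated)


# Assumption for illustration: every step except 6-phosphofructokinase
# has a gene candidate in the Ca. E. factor dataset.
annotated_factor = set(EMP_STEPS) - {"6-phosphofructokinase"}

print(missing_steps(EMP_STEPS, annotated_factor))
print(is_complete(EMP_STEPS, annotated_factor))
```

A check of this kind cannot distinguish a truly absent gene from one lost to a sequencing gap or binning error, which is why the partially sequenced PfkB homolog in Ca. E. gemina matters for the interpretation.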

Fig. 1.


Metabolic pathway reconstruction of Ca. Entotheonella. Ca. Entotheonella possesses various transporters for inorganic and organic substrates. Given the presence of appropriate transporters, the utilization of sugars is likely. The exact pathway of hexose and pentose catabolism, however, is elusive. Lactate, acetate, and glycerol are likely used as carbon sources. MDH renders methanol available for energy metabolism. The assimilation of C1 compounds via the serine pathway is plausible; however, some uncertain annotations require experimental confirmation. DH, dehydrogenase; DHAP, dihydroxyacetone phosphate; GAP, glyceraldehyde-3-phosphate; KDG, 2-keto-3-deoxy gluconate; OA, oxaloacetate; PEP, phosphoenolpyruvate; Pfk, phosphofructokinase.

Interestingly, no convincing evidence was obtained for the oxidative portion of the pentose phosphate pathway, including the key enzyme glucose-6-phosphate dehydrogenase (G6PDH). Although several genes encode F420-dependent dehydrogenases, none of them belongs to the trusted family TIGR03554. No complete variant of the Entner–Doudoroff–related pathways (phosphorylative, semiphosphorylative, or nonphosphorylative) was identified either. There are, however, several genes and their expressed proteins with moderate similarity to D-glucono-1,5-lactonase, a component of the nonphosphorylative or semiphosphorylative Entner–Doudoroff pathways. Although it cannot be excluded that the irregularities observed might be partly due to sequencing or assembly errors, the sugar metabolism of Ca. Entotheonella is a very attractive target for further studies, because it might unveil some as-yet unknown biochemistry.

Ca. Entotheonella Are Likely Fueled by Methanol and Oxalate.

More conclusive suggestions on the putative energy metabolism of Ca. Entotheonella were provided by the proteomics experiment: A PQQ-dependent methanol dehydrogenase (MDH; GenBank accession no. ETW97015) ranks among the top 30 most abundant proteins (by intensity in shotgun proteomics; Dataset S2). This finding is typical for methylotrophic organisms, such as Methylobacterium extorquens (27). Intrigued by this result, we constructed a phylogenetic tree of the candidate MDH proteins together with known MDHs classified by Keltjens et al. (28). Both Ca. Entotheonella enzymes cluster well with MDHs (Fig. S2A) and, more specifically, fall into the XoxF2 (lanthanide-dependent methanol dehydrogenase) clade. In contrast to the long-known MxaF-type MDHs that use Ca²⁺ as cofactor, XoxF-type MDHs have only recently been shown to depend on ions of rare earth elements like La³⁺ or Ce³⁺ (29). In addition, they use the rare coenzyme pyrroloquinoline quinone (PQQ). The xoxF gene is clustered with genes for the periplasmic protein XoxJ and the cytochrome c homolog XoxG, a genome context also known from other xoxF homologs (28). These data strongly suggest that Ca. Entotheonella can use methanol for energy metabolism. Furthermore, the genomes encode two putative enzymes for further oxidation of methanol to carbon dioxide, both depending on rare cofactors (Fig. S2B): mycothiol-dependent formaldehyde dehydrogenase (30) and tungsten-dependent formate dehydrogenase (31). Like MDH, both coding genes are expressed in situ, which further supports methanol oxidation as an energy source in Ca. E. factor and Ca. E. gemina. Notably, methanol is an abundant molecule in the oceans (32). To also feed on C1 compounds as a sole carbon source, C1 fixation pathways have to be present. Examining such pathways revealed a large portion of the serine pathway (33), including serine-hydroxymethyl transferase (expressed) and D-glycerate-2-kinase (Fig. 1). However, dedicated candidate genes of the pathway-specific malyl-CoA synthetase (EC 6.2.1.9, distinct from succinyl-CoA synthetase) were not detected. We are therefore cautious with a final conclusion about C1 compound assimilation in Ca. Entotheonella, but we suggest that the addition of methanol and rare earth elements (e.g., LaCl3) to artificial growth media could be a highly promising strategy for cultivation experiments. Interestingly, the addition of rare earth metals to growth media was a key to cultivating methanotrophic Verrucomicrobia successfully from volcanic mud pots (29).

Fig. S2.


Putative methanol utilization enzymes of Ca. Entotheonella. (A) Phylogenetic tree deduced by maximum likelihood analysis of known PQQ-dependent MDHs, including putative MDHs from Ca. E. factor and Ca. E. gemina (bold). Both enzymes are members of the well-defined XoxF family (lanthanide-dependent), more specifically the XoxF2 subfamily. (B) Proposed methanol oxidation pathway in Ca. Entotheonella. DH, dehydrogenase; MSH, mycothiol; ox., oxidized; red., reduced; WPt, tungsten-pterin.

In addition to methylotrophy, there is evidence for oxalotrophy of Ca. Entotheonella, because we found a putative formyl-CoA/oxalate CoA-transferase gene expressed in both Ca. Entotheonella phylotypes. Together with the oxalyl-CoA decarboxylase (ETW96684 and ETW96684, both expressed) and the tungsten-dependent formate dehydrogenase, this enzyme could enable Ca. Entotheonella to use oxalate as an energy source. Additionally, a putative oxalate/formate antiporter (ETW98700 and ETX08188) could directly generate a proton motive force, similar to that described for Oxalobacter formigenes (34). Notably, the formation of calcium oxalate has been reported in the marine sponge Chondrosia reniformis (35).

Vitamins for Productive Microbes.

The putative dependence of Ca. Entotheonella on rare cofactors directed us to the investigation of coenzyme biosynthesis pathways. Auxotrophy for cofactors should provide helpful hints for the design of appropriate growth media. In agreement with the presence of enzymes dependent on coenzyme F420, PQQ, and mycothiol (Fig. 2A), we indeed identified biosynthesis pathways for these rare cofactors encoded in the genome. Key enzymes for F420 and PQQ biosynthesis are expressed according to our proteomics data. Because both Ca. Entotheonella candidate species harbor an F420/glutamyl transferase (EC 6.3.2.31), we assume that F420 is present in a polyglutamylated form. The deazaflavin moiety is known to exhibit strong autofluorescence, which is, intriguingly, also a property of a proportion of Ca. Entotheonella cells in our samples. To test whether this fluorescence might be connected to the presence of F420, we recorded fluorescence emission spectra of autofluorescing Ca. Entotheonella filaments and compared them with control spectra obtained from Methanobrevibacter smithii DSM 861, a methanogenic archaeon known to produce large amounts of the coenzyme F420. Gratifyingly, the spectra were nearly identical (Fig. 2B), supporting the hypothesis that Ca. Entotheonella produces F420.
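A comparison of two emission spectra of this kind can be made quantitative by normalizing each spectrum to its maximum and scoring the overlap. The sketch below is one possible way to do so and is not the procedure used in the study: the choice of cosine similarity is an assumption, and the intensity values are invented placeholders on a shared wavelength grid, not measured data.

```python
import math


def normalize(spectrum):
    """Scale intensities so that the emission maximum equals 1."""
    peak = max(spectrum)
    return [value / peak for value in spectrum]


def cosine_similarity(a, b):
    """Cosine of the angle between two spectra sampled on the same grid."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical intensities (arbitrary units) at identical wavelengths.
filament = [2.0, 10.0, 20.0, 12.0, 4.0]   # placeholder for a filament
reference = [1.0, 5.0, 10.0, 6.0, 2.0]    # placeholder for the F420 control

similarity = cosine_similarity(normalize(filament), normalize(reference))
print(similarity)  # values close to 1 indicate near-identical spectral shapes
```

Normalization removes differences in absolute brightness between the two samples, so the score reflects spectral shape only. A high score is consistent with a shared fluorophore but does not prove its identity.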

Fig. 2.


Rare cofactors of Ca. Entotheonella. (A) Chemical structure of selected rare cofactors encoded in the Ca. Entotheonella genomes. PQQ is a redox cofactor necessary for MDHs. Mycothiol functions as a glutathione-like cofactor in Actinobacteria like Mycobacterium tuberculosis. Coenzyme F420 (a deazariboflavin) is a fluorescent two-electron transfer cofactor mainly found in Actinobacteria and methanogenic archaea. (B) Normalized fluorescence emission spectra of a representative Ca. Entotheonella filament and M. smithii DSM 861, a methanogenic archaeon known to produce F420. The excitation wavelength is 405 nm. The fluorescence emission spectra are nearly identical. (C) Autofluorescence of Ca. Entotheonella filaments: bright-field image of a representative Ca. Entotheonella filament (Left) and fluorescence image of a Ca. Entotheonella filament recorded at 471 nm using an excitation of 405 nm (Right). (Scale bars: 5 μm.)

Concerning more common cofactors, we identified complete pathways for nicotinamide, flavin, folate, pyridoxal phosphate, heme, lipoate, and molybdenum cofactor (Dataset S1). Additionally, the oxygen-independent (early cobalt insertion) pathway for cobalamin (B12) is represented and expressed. Consistently, Ca. Entotheonella uses a B12-dependent ribonucleotide reductase (ETW93474 and ETX06210), which is expected to be functional under both aerobic and anaerobic conditions (36). Curiously, we were not able to identify an obvious route for pantothenate production. This finding is surprising in light of the high number of polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) proteins in both Ca. Entotheonella phylotypes. PKS and NRPS megaenzymes contain multiple carrier domains that use a covalently bound phosphopantetheine arm as a prosthetic group. Accordingly, the Ca. Entotheonella genomes contain several genes coding for phosphopantetheinyl transferases (PPTases; pfam01648). Deduced proteins ETX01839 and ETX09163 from Ca. E. factor and Ca. E. gemina, respectively, represent holo-acyl carrier protein synthases (ACPS) activating fatty acid synthases. ETW94202 and ETW96748 are Sfp-type PPTases, which are typically responsible for the activation of assembly line-like enzymes. Another enzyme (ETX03667), which is phylogenetically distinct from the other candidates, is encoded by a gene on plasmid pTSY1. This plasmid is particularly rich in secondary metabolite gene clusters. None of the PPTase proteins, however, could be identified by proteomics.

Also, no evidence for biotin biosynthesis was obtained. Again, this finding is unusual for polyketide-producing organisms, because acetyl-CoA carboxylase depends on biotin for the activation of bicarbonate. Concerning classical quinone cofactors, we could not detect genes for ubiquinone biosynthesis, but we did find parts of the classical menaquinone biosynthesis route. Surprisingly, genes encoding key enzymes of menaquinone biosynthesis, such as chorismate isomerase, were likewise absent in our datasets. Furthermore, thiamine biosynthesis appeared to be incomplete. This pathway most likely starts from hydroxymethyl pyrimidine (HMP), because phosphohydroxymethyl pyrimidine synthase (EC 4.1.99.17) is missing. In agreement with this hypothesis, we discovered a putative HMP importer belonging to the ABC transporter family (specificity was predicted by the AbcDB). Although we cannot rule out that some missing genes might simply reflect sequencing gaps, we suggest that artificial growth media for Ca. Entotheonella should be supplemented with cofactors or vitamins, particularly with thiamine, pantothenate, and biotin.

Although automatic annotations predicted both Ca. Entotheonella phylotypes to be auxotrophic for several amino acids, we were able to identify complete pathways for the biosynthesis of all proteinogenic amino acids by manual searches (Dataset S1). The presence of various amino acid and oligopeptide importers is, of course, not a contradiction to that finding. We therefore hypothesize that the addition of amino acids to growth media could promote growth of Ca. Entotheonella, but might not be necessary.

Bioactive Metabolites.

The extraordinary genetic potential of Ca. Entotheonella to produce bioactive metabolites has been discussed in detail before (15, 21). Unfortunately, we could not identify any of the assembly line-like (PKS or NRPS) natural product biosynthesis enzymes by shotgun proteomics. We could, however, identify PoyA (ETX03681), the precursor peptide of polytheonamides (17). In addition to these analyses, we have discovered genes for carotenoid biosynthesis (Fig. 3A), which are in a different context involved in pathogenicity. The encoded enzymes are highly similar to the homologs of Staphylococcus aureus that generate staphyloxanthin (37). This eponymous yellowish-golden (Latin aureus = golden) pigment was shown to protect the pathogen from oxidative stress (38) and from killing by neutrophil cells (39). For Ca. Entotheonella, a possible role of the carotenoid could be UV-induced stress defense, because the biosynthesis cluster is flanked by a B12-dependent light-sensing transcription factor (40). Thus, Ca. E. factor is likely able to perceive UV light and react to it by pigment formation. The gene loci and a proposed pigment biosynthetic pathway are shown in Fig. 3B. Due to the similarities to staphyloxanthin, we propose the name theoxanthin for the as-yet unknown Ca. Entotheonella carotenoid. The terpenoid precursors are, as usual in bacteria, produced via the nonmevalonate pathway (MEP/DOXP) starting from 1-deoxy-xylulose-5-phosphate (41). Experiments in our laboratory to identify theoxanthin by spectroscopy and MS have not been successful so far.

Fig. 3.


Proposed staphyloxanthin-like carotenoid (theoxanthin) biosynthesis pathway in Ca. Entotheonella. (A) Symbolic representation of two gene loci encoding the postulated theoxanthin biosynthesis pathway. Arrows represent coding regions indicating the direction of transcription and are drawn to scale. (Scale bar: 1 kb.) Accession numbers of the deduced proteins are specified within arrows. (B) Hypothetical biosynthetic route leading to theoxanthin. The product theoxanthin is as yet unknown, but all biosynthetic steps resemble corresponding steps in the biosynthesis of staphyloxanthin, the yellow-golden pigment of S. aureus. AT, acyltransferase; FAD, flavin adenine dinucleotide; FPP, farnesyl pyrophosphate; GT, glycosyl transferase; NDP, nucleoside diphosphate.

Because more than 40 bioactive natural products were previously isolated from T. swinhoei Y (42–44), we also investigated, as examples, the localization of two PKS-NRPS hybrid products, onnamide A and cyclotheonamide A, by matrix-assisted laser desorption-ionization imaging mass spectrometry (MALDI-IMS). This MS method was recently used to localize misakinolide A spatially in the chemotype T. swinhoei WA from Hachijo-Jima, Japan (21). Subsequent catalyzed reporter deposition fluorescence in situ hybridization (CARD-FISH) experiments with Ca. Entotheonella-specific probes revealed the same spatial distribution for these bacteria as for misakinolide. Interestingly, we observed here the same highly localized distribution for Ca. Entotheonella, cyclotheonamide A, and onnamide A within the pores, chambers, and exterior part of the yellow chemotype of the sponge (Fig. 4). Collectively, these findings further support the hypothesis that Ca. Entotheonella locally produces these cytotoxic compounds in surface-accessible areas, possibly to protect the host sponge from predators, to assist in killing of eukaryotic prey in chambers, and perhaps to protect the sponge tissue by a compartmentalization-like strategy. The CARD-FISH–based localization to regions facing sea water also supports our in silico data that suggest aerobic metabolism.

Fig. 4.


Localization of Ca. Entotheonella and the bioactive metabolites cyclotheonamide A and onnamide A. (A) Localization of Ca. Entotheonella by CARD-FISH. Overlay of a bright-field image of representative pores of T. swinhoei Y and a fluorescent image obtained from CARD-FISH labeling of Ca. Entotheonella. (Scale bars: 100 μm.) (B–G) Localization of cyclotheonamide A and onnamide A in the sponge tissue. (B and E) Representative optical image of a thin slice of T. swinhoei Y used for MALDI-IMS before MALDI matrix application. False-color heat map representation of spatial distribution of cyclotheonamide A (C) and onnamide A (F). Overlay of the optical image and the spatial distribution of cyclotheonamide A (D) and onnamide A (G) in the sponge tissue. (Scale bars: 2 mm.)

Symbiosis Factors.

Members of the candidate genus Ca. Entotheonella are associated with various marine sponges and are likely to produce mixtures of defensive compounds for the benefit of their hosts (7, 15, 21). To identify further genetic factors involved in mutualism, we searched the genome for classical virulence factors that are also commonly found in symbiotic bacteria. However, we did not find any typical virulence-related secretion systems, such as type III secretion systems (except for the general secretion pathway), or any other virulence factors listed in the TIGRFAMs database of protein families (TIGRFAM) under TIGRFAM role “pathogenesis.” The lack of classical symbiosis factors sometimes points toward more loose associations with hosts rather than highly intimate symbioses. However, the fact that Ca. Entotheonella bacteria are regularly associated with sponges and have thus far not been isolated from any other source is an indication for true mutualism and suggests as-yet unknown factors involved in recognition, colonization, and survival within the host. Some of the orphan biosynthetic gene clusters of Ca. Entotheonella might produce compounds that act as signaling molecules or host manipulation factors.

Boosted Enzyme Families for Complex Metabolism.

Both Ca. E. factor and Ca. E. gemina possess very large genomes of ca. 9 Mb. However, the reported high numbers of repetitive elements and specialized metabolite biosynthesis clusters alone do not account for this exceptional genome size. We therefore wished to obtain insights as to whether the large genome size is caused by a high number of distinct encoded protein families, an accumulation of known protein families, or the presence of unknown protein families. In the genome of Ca. E. factor (Ca. E. gemina), 6,318 (6,473) of 8,440 (8,990) protein coding sequences (CDSs) were assigned to known families in the protein family (PFAM) database (45). This finding means that roughly 25% (28%) of the CDSs have no hit in the PFAM database. The PFAM database was chosen because the compiled PFAMs are supposed to be nonoverlapping. On the other hand, it should be noted that domains of one protein usually belong to different PFAMs. The 6,318 (6,473) proteins with PFAM hits can be assigned to 2,225 (2,211) distinct PFAMs, indicating that multiplication of a subset of families, rather than the accumulation of many distinct families, is the mechanism behind genome expansion in Ca. Entotheonella. This trend is general, because gene duplication and functional differentiation are more likely than horizontal transfer or de novo evolution of genes. To identify PFAMs that are particularly enriched within Ca. Entotheonella, we compiled a matrix (PFAMs versus genomes) containing the number of all PFAM hits per genome analyzed. To put genome statistics into relation, we chose a set of 100 genomes covering a broad range of diverse phylogenetic groups, lifestyles (e.g., pathogenic, free-living), morphologies, metabolic features, and varying genome sizes (Dataset S3). Thereby, we obtained an abundance matrix of 8,301 PFAMs in 100 genomes (Dataset S4). We ordered the table by abundance in Ca. E. factor to retrieve the 50 most abundant PFAMs (top 50) present in its genome (a short version of the table is shown in Fig. 5). Notably, three PFAMs within the top 50 of Ca. E. factor are not enriched in Ca. E. gemina. These PFAMs comprise parts of the NRPS assembly lines: pfam00550 (phosphopantetheine attachment site), pfam00668 (condensation domain), and pfam13745 (HxxPF-repeated domain, unknown function). This finding is in agreement with the previous finding that Ca. E. factor harbors far more NRPS genes than Ca. E. gemina. Inspecting the 50 most abundant PFAMs of Ca. E. factor, we observe a general tendency: The genome size reflects the scope of physiological responses to varying environmental conditions. Consequently, transporters like ABC transporter (pfam00005) or major facilitator (pfam07690) and signal transduction systems [e.g., response regulator receiver domain (pfam00072), histidine kinase A (phosphoacceptor) domain (pfam00512), sigma factor 70 (pfam04542)] are enriched in Ca. Entotheonella, but also in other large genomes (e.g., Anabaena, Streptomyces, Myxococcus). To pinpoint protein families that are particularly enriched in Ca. Entotheonella, we adjusted our data matrix by subtraction of the median abundance of each PFAM over the 100 genomes (Dataset S5). Indeed, several PFAMs were particularly abundant in Ca. Entotheonella. One of these PFAMs is pfam02515 (CoA-transferase family III), with around 90 members in each Ca. Entotheonella phylotype, followed by Bordetella pertussis and Streptomyces rapamycinicus, with only 22 and 17 representatives, respectively. CoA-transferase III was originally discovered in anaerobic pathways and is involved in the activation of various organic acids like oxalate, bile acids, or benzylsuccinate for subsequent reactions, such as decarboxylation, β-oxidation, or racemization (46). It is therefore tempting to speculate that Ca. Entotheonella might possess an arsenal of enzymes to break down a wide range of organic acids. 
Notably, one member of pfam02515 is the previously mentioned putative formyl-CoA/oxalate CoA-transferase presumably involved in oxalotrophy. Hitherto, the record holder in accumulating CoA-transferase family III enzymes is Frankia sp. EuI1c, with 49 sequences (of all 873 organisms listed in the PFAM species distribution section). Thus, both Ca. Entotheonella species significantly surpass this number, with 87 members in Ca. E. factor and 93 members in Ca. E. gemina. Another expanded PFAM with 22 instances in Ca. E. factor and 30 in Ca. E. gemina is the molybdopterin-binding domain of aldehyde dehydrogenase (pfam02738), which represents the large subunit of xanthine dehydrogenase, CO dehydrogenase, or hydroxybenzoyl-CoA reductase (47). Because the copper-binding motif of CO dehydrogenase is missing in all of the deduced Ca. Entotheonella enzymes, we suspect a role in the breakdown of aromatic compounds. This hypothesis is further supported by the high number of taurine catabolism dioxygenase TauD family proteins (pfam02668). Although TauD itself is involved in utilization of taurine as a sulfur source, other family members are involved in the breakdown of the pesticide 2,4-dichlorophenoxyacetic acid, for instance (48). In conclusion, these data suggest that Ca. Entotheonella is a prolific source of complex and untapped biochemistry involved in the transformation of small organic molecules waiting to be characterized.
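The genome statistics quoted in this section can be rechecked with simple arithmetic. The following Python sketch is an illustration only and is not part of the original analysis; the dictionary layout and the function name `summarize` are assumptions. It recomputes the share of CDSs without a PFAM hit and the mean number of annotated CDSs per distinct PFAM from the counts given in the text. Because a multidomain protein can hit several PFAMs, the second value is only a rough indicator of family expansion.

```python
# Counts reported in the text for the two Ca. Entotheonella genomes.
GENOMES = {
    "Ca. E. factor": {"cds_total": 8440, "cds_with_pfam": 6318, "distinct_pfams": 2225},
    "Ca. E. gemina": {"cds_total": 8990, "cds_with_pfam": 6473, "distinct_pfams": 2211},
}


def summarize(stats):
    """Return (percent of CDSs without a PFAM hit, annotated CDSs per distinct PFAM)."""
    without_hit = stats["cds_total"] - stats["cds_with_pfam"]
    percent_no_hit = 100.0 * without_hit / stats["cds_total"]
    cds_per_pfam = stats["cds_with_pfam"] / stats["distinct_pfams"]
    return percent_no_hit, cds_per_pfam


for name, stats in GENOMES.items():
    percent_no_hit, cds_per_pfam = summarize(stats)
    print(f"{name}: {percent_no_hit:.1f}% of CDSs without PFAM hit, "
          f"{cds_per_pfam:.2f} annotated CDSs per distinct PFAM")
```

Both genomes give a ratio well above one annotated CDS per distinct PFAM, which is consistent with the interpretation that a subset of families has been multiplied.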

Fig. 5.

Heat map of the 16 most abundant PFAMs encoded in the Ca. E. factor genome. Numbers in the table represent absolute PFAM counts. A color code was applied to the fields to create a heat map (red, high abundance; green, low abundance). To put values in a meaningful relationship, the abundance of these PFAMs in nine other genomes, as well as maximum and median values, is provided. Notably, Ca. Entotheonella are extremely rich in LLMs and CoA-transferase family III enzymes. The full table (all PFAMs in 100 genomes) is shown in Dataset S4. A description of the genomes is provided in Dataset S3.

Extraordinary Abundance of Luciferase-Like Monooxygenases.

Another intriguing PFAM that is extraordinarily abundant in Ca. Entotheonella is pfam00296, the luciferase-like monooxygenase (LLM). It is the most abundant PFAM in both Ca. E. factor (171 members) and Ca. E. gemina (168 members). Due to the great expansion of this family and the fact that bioluminescence of Ca. Entotheonella filaments has not been observed, it is reasonable to assume that these proteins do not act as true luciferases. Rank 2 in our 100-genome set, with 63 LLM genes, is held by S. rapamycinicus (producer of the immunosuppressant rapamycin), another highly talented producer of specialized metabolites (Fig. 5). This finding might suggest a role of LLMs in secondary metabolism. Notably, members of the TIGRFAM TIGR04020 (natural product biosynthesis LLM domain, a subfamily of pfam00296) are typically found encoded in specialized metabolite gene clusters. However, only four proteins of Ca. E. factor, none of Ca. E. gemina, and two of S. rapamycinicus belong to this particular subfamily. In Ca. E. factor, ETX03783 (OnnC) and ETX03434 (NazB) are proposed hydroxylases and belong to the onnamide and nazumamide clusters, respectively. The other two proteins, ETW93540 and ETX02882, are not part of any identified natural product gene cluster. Another interesting feature of LLMs is the fact that several subfamilies (TIGRFAMs) depend on coenzyme F420 instead of a flavin cofactor (49). In light of the presence of this cofactor in Ca. Entotheonella, we assigned all members of pfam00296 to TIGRFAM subfamilies (Dataset S6). However, only 59 of 171 LLM proteins from Ca. E. factor and 42 of 168 from Ca. E. gemina could be assigned to any TIGRFAM. Consequently, one might expect a considerable proportion of LLMs belonging to novel subfamilies. Furthermore, we identified multiple members of assigned F420-dependent subfamilies, such as the poorly characterized TIGR03619 (33 members in Ca. E. factor and 29 in Ca. E. gemina). Because the identity of G6PDH is enigmatic in Ca. Entotheonella, the idea is tempting that enzymes with G6PDH or related activities might be hidden among the large diversity of LLM proteins. However, as discussed above, a bona fide member of TIGR03557 (F420-dependent G6PDH) is not present in Ca. Entotheonella. Thus, the role of the multifarious LLMs of Ca. Entotheonella remains obscure, and future work is highly warranted to shed more light on this PFAM.
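To make the size of the unassigned LLM pool concrete, the short Python sketch below recomputes it from the counts given above. It is an illustration only; the names `LLM_COUNTS` and `unassigned_fraction` are assumptions introduced here.

```python
# (total pfam00296 members, members assigned to any TIGRFAM), as reported in the text.
LLM_COUNTS = {
    "Ca. E. factor": (171, 59),
    "Ca. E. gemina": (168, 42),
}


def unassigned_fraction(total, assigned):
    """Fraction of LLM proteins without any TIGRFAM subfamily assignment."""
    if total <= 0 or not 0 <= assigned <= total:
        raise ValueError("expected total > 0 and 0 <= assigned <= total")
    return (total - assigned) / total


for name, (total, assigned) in LLM_COUNTS.items():
    share = 100.0 * unassigned_fraction(total, assigned)
    print(f"{name}: {total - assigned} of {total} LLMs unassigned ({share:.1f}%)")
```

By this count, roughly two-thirds to three-quarters of the LLM proteins in each genome lack a TIGRFAM assignment.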

Conclusion

Our analyses suggest that the as-yet uncultured bacteria Ca. E. factor and Ca. E. gemina are organotrophic, aerobic (or at least microaerophilic), facultative anaerobic organisms that are likely able to use a broad range of organic carbon sources like organic acids, alcohols, polyols, polysaccharides, or even complex aromatic compounds. The exact pathways of pentose and hexose degradation, however, remain enigmatic. Ca. Entotheonella members appear to be auxotrophic for biotin, pantothenate, and thiamine, but are able to produce rare cofactors like PQQ, F420, and a mycothiol-like cofactor. They harbor the genetic potential to synthesize most, if not all, amino acids. Proteomics data suggest that a lanthanide-dependent MDH is a highly abundant protein in the cell, thus supporting methanol utilization. Hence, the addition of rare earth elements and methanol to the growth media could be a promising approach for targeted cultivation. As shown for the related E. serta, the producer of misakinolide, Ca. Entotheonella filaments from T. swinhoei Y are located in external and internal sponge tissue regions that face sea-water, and their location is perfectly correlated with metabolites whose biosynthetic pathways were identified in the Ca. E. factor genome. Both Ca. Entotheonella strains possess an extremely broad repertoire of transporters and regulators, suggesting that they might not be restricted to a defined ecological niche, but able to react to varying environmental conditions. Our studies demonstrate that Ca. Entotheonella contain a huge arsenal of presumably novel biochemical pathways most likely not only for the production of bioactive natural products but also for the breakdown of complex organic molecules. These unusual bacteria thus represent a rich resource for research on microbial physiology, bioorganic chemistry, biotechnology, and natural products. 
Just recently, a novel antibiotic with promising properties, teixobactin, has been discovered from previously uncultured bacteria (50, 51). The study presented here might suggest cultivation strategies that could provide access to natural product sources from a wide range of sponges. Finally, we believe that our approach should be valuable for studies on uncultured organisms in general.

While we were finalizing this paper, an analysis by Liu et al. (52) of two metagenome bins from a T. swinhoei Y sponge collected in the South China Sea was published. Both binned genomes are closely related to those genomes of the Ca. Entotheonella strains described here (99.9% average nucleotide identity), and should thus belong to the same candidate species. In contrast to our analysis, Liu et al. (52) claim that they identified the Calvin–Benson cycle for CO2 fixation (via ribulose-1,5-bisphosphate carboxylase), candidates for crassulacean acid metabolism (a phenomenon exclusively known in plants), a type VI secretion system, and a complete system for endospore formation. Despite extensive manual and automated reinvestigation of our dataset as well as the sequence data provided for the Chinese sponge, we cannot find any evidence for these genomic features.

Materials and Methods

Enrichment of Ca. Entotheonella Cells.

An enriched pellet of filamentous bacteria was prepared as described previously (15). Ca. Entotheonella was then further purified on a 20:60:100 (vol/vol) stepwise Percoll (GE Healthcare) gradient in calcium- and magnesium-free artificial seawater (CMF-ASW). Ten milliliters of each Percoll dilution was layered in a 50-mL Falcon tube on ice. A 0.5-mL cell suspension of the filamentous bacterial fraction was carefully layered on top of the gradient and centrifuged at 250 × g for 20 min. Cell fractions were carefully collected with a syringe and analyzed by phase-contrast microscopy based on cell morphology. The filamentous Ca. Entotheonella fraction was collected at the 60%:100% Percoll interface and stored in CMF-ASW at 4 °C. For phase-contrast microscopy, cells were spotted on a cover slide, dried, covered with mountant (Citifluor Ltd.), and subsequently observed under a Zeiss Axioskop 2 epifluorescence microscope equipped with a 75-W xenon arc lamp (XBO 75) and a 40× Plan Apochromat objective.

Shotgun Proteomics.

The Ca. Entotheonella cell pellet was resuspended in 30 μL of sample buffer, and 10 and 20 μL were loaded onto an acrylamide gel with a gradient of 4–12% (wt/vol) acrylamide in 3-(N-morpholino)propanesulfonic acid buffer. The 20-μL lane was cut into eight sections. Sections were cut into small pieces and washed twice with 100 μL of 50% (vol/vol) acetonitrile in 100 mM NH4HCO3 and then washed with 50 μL of acetonitrile (100%). All three supernatants were discarded. Proteins were digested by addition of 20 μL (10 μL for section 1) of sequencing grade trypsin [5 ng/μL in 10 mM Tris, 2 mM CaCl2 (pH 8.2)] and 30 μL of buffer [10 mM Tris, 2 mM CaCl2 (pH 8.2)] and incubated overnight at 37 °C. The supernatant was removed, and gel pieces were extracted with 150 μL of 50% (vol/vol) acetonitrile in H2O containing 0.1% TFA. All supernatants were combined and dried. Samples were dissolved in 20 μL of 0.1% formic acid and transferred to autosampler vials for LC-MS/MS. Five microliters was injected into a NanoAcquity UPLC instrument (Waters, Inc.) connected to a Q Exactive mass spectrometer (Thermo Scientific) equipped with a Digital PicoView source (New Objective). Peptides were trapped on a Symmetry C18 trap column (5 μm, 180 μm × 20 mm; Waters, Inc.) and separated on a BEH300 C18 column (1.7 μm, 75 μm × 150 mm; Waters, Inc.) at a flow rate of 250 nL⋅min−1 using a gradient from 1% of solvent B (0.1% formic acid in acetonitrile)/99% of solvent A (0.1% formic acid in water) to 40% (vol/vol) of solvent B/60% of solvent A within 90 min. Mass spectrometer settings were set for a data-dependent analysis, with a precursor scan range of 350–1500 m/z, resolution of 70,000, maximum injection time of 100 ms, and threshold of 3 × 10^6 and with a fragment ion scan range of 200–2000 m/z, resolution of 35,000, maximum injection time of 120 ms, and threshold of 1 × 10^5. 
Database searches were performed using Mascot (database: NCBI_nr, all species) and MaxQuant (25) for searching against a database containing only the deduced protein sequences of Ca. Entotheonella.
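For orientation, the solvent composition during the separation can be sketched as a function of time. The text states only the start point, the end point, and the duration of the gradient; the linear shape of the ramp and the function name `solvent_b_percent` are assumptions made for this illustration.

```python
def solvent_b_percent(t_min, start=1.0, end=40.0, duration_min=90.0):
    """Percent solvent B at t_min minutes, ASSUMING a linear ramp.

    start, end, and duration_min are taken from the text (1% to 40% B
    within 90 min); linearity is an assumption, not stated in the source.
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    # Clamp to the programmed gradient window.
    t = min(max(t_min, 0.0), duration_min)
    return start + (end - start) * t / duration_min


for t in (0, 30, 60, 90):
    b = solvent_b_percent(t)
    print(f"t = {t:2d} min: {b:.1f}% B / {100.0 - b:.1f}% A")
```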

Single-Cell Fluorescence Emission Spectra Analysis.

For confocal microscopy, cells were dried and mounted in aqueous mounting media (Vectashield; Vector Laboratories, Inc.). Images were acquired using a Zeiss LSM 710 confocal system mounted on a Zeiss AxioObserver.Z1 microscope with an oil immersion Plan-Apochromat 63×/1.4-N.A. differential interference contrast (DIC) objective lens. Excitation was performed using the 405-nm diode laser line, and emission was collected at various emission wavelengths with a photomultiplier tube. Lambda stacks were collected with a step size of 5 nm for Ca. E. factor and, due to photobleaching, with a step size of 10 nm for M. smithii DSM 861 (American Type Culture Collection 35061) cells. All confocal images were acquired with a pinhole of 600 μm, a frame size of 664 × 664 pixels, and a scan zoom of 3. Image analysis was performed with Zen2012 (www.zeiss.de/corporate/de_de/home.html) and ImageJ (https://imagej.nih.gov/ij/) software.

Localization of Cyclotheonamide A, Onnamide A, and Ca. Entotheonella.

MALDI-IMS experiments and CARD-FISH experiments were performed as described previously (21). Briefly, cryopreserved T. swinhoei Y sponge was cut into 50-μm slices using a microtome (Microm; Thermo Fisher). Representative subsequent sponge tissue slices were then transferred either to stainless-steel target plates suitable for MALDI-IMS experiments or onto microscopy slides for subsequent CARD-FISH analysis. A Sunchrome SunCollect MALDI Spotter was used to cover the tissue slice uniformly with five to six layers of conventional MALDI matrix (1:1 mixture of α-cyano-4-hydroxycinnamic acid and 2,5-dihydroxybenzoic acid; 30 mg/mL in methanol; Sigma-Aldrich). The samples were then measured on a MALDI LTQ-Orbitrap mass spectrometer (Thermo Scientific) in positive mode with a laser energy of 45 μJ, two microscans per step, and a resolution of 60,000×. Data were acquired for a mass range between 500 and 1600 m/z. Using ImageQuest, masses of interest were assigned false colors, and the resulting image was then superimposed onto an optical image of the tissue slice taken before MALDI matrix application.

For CARD-FISH experiments, tissue slices were immobilized on microscopy slides and subsequently fixed in PBS-buffered 4% (wt/vol) paraformaldehyde (4 °C, overnight). After washing with PBS [at room temperature (r.t.)], slices were dehydrated [99% (vol/vol) ethanol, 3 min at r.t.], air-dried (2 h at r.t.), and permeabilized with 10 mg/mL lysozyme (30 min at r.t.). After inactivation of endogenous peroxidases with 0.01 M HCl for 15 min at r.t., slices were washed with H2O, dehydrated (99% ethanol, 5 min at r.t.), and air-dried. The hybridization reaction was performed in hybridization solution [0.9 M NaCl, 20 mM Tris⋅HCl (pH 7.6), 10% (wt/vol) dextran sulfate, 0.05% SDS, 1% nucleic acid blocking reagent (Roche), 0.5 mg/mL herring sperm DNA (Sigma)] containing 55% (vol/vol) formamide and 0.5 ng/μL Ca. Entotheonella-specific, horseradish peroxidase (HRP)-coupled probe ESP219 (5′-CCG CAA GCY CAT CTC AGA CC-3′; BioMers) for 2 h at 35 °C. The tissue slices were then washed once with prewarmed washing buffer [3 mM NaCl, 5 mM EDTA (pH 8.0), 20 mM Tris⋅HCl (pH 7.6), 0.05% SDS] for 30 min at 37 °C and once with PBS containing 0.01% Triton X-100 (PBS-T). After equilibration of the probe-delivered HRP in PBS for 15 min at r.t., signal amplification took place in amplification buffer [1× PBS (pH 7.6), 2 M NaCl, 20% (wt/vol) dextran sulfate, 0.1% nucleic acid blocking reagent (Roche), 0.0015% H2O2] containing Alexa Fluor 633-labeled tyramide (Life Science) for 2 h at 37 °C in the dark. Subsequently, slices were washed three times in PBS-T for 15 min at r.t., rinsed with H2O, dehydrated, and air-dried. For microscopic analysis, slices were covered with mountant (Citifluor Ltd.) and observed under a Zeiss Axioskop 2 epifluorescence microscope equipped with a 75-W xenon arc lamp (XBO 75), an appropriate filter set for Alexa Fluor 633, and a 10× Plan Apochromat objective.
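The probe ESP219 contains the IUPAC ambiguity code Y (pyrimidine, C or T), so the written sequence stands for two oligonucleotides. The Python sketch below is an illustration only; the helper names `expand` and `gc_percent` are assumptions introduced here. It expands the degenerate sequence given in the text and reports the length and GC content of each variant.

```python
from itertools import product

# IUPAC nucleotide codes needed for this probe; Y (pyrimidine) is C or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT"}

# Probe sequence as printed in the text, with the spacing removed.
PROBE_ESP219 = "CCG CAA GCY CAT CTC AGA CC".replace(" ", "")


def expand(probe):
    """List every concrete sequence encoded by a degenerate probe."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in probe))]


def gc_percent(seq):
    """GC content of a non-degenerate sequence, in percent."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


for variant in expand(PROBE_ESP219):
    print(f"5'-{variant}-3'  length {len(variant)} nt, GC {gc_percent(variant):.0f}%")
```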

Genome Reannotation and Pathway Prediction.

The Ca. Entotheonella genome analysis presented in this study is based on the previously reported dataset available at the NCBI, as well as a reannotated version of Ca. E. factor TSY1 [synonym: Ca. Entotheonella sp. TSY1; NCBI accession no. AZHW00000000.1, Integrated Microbial Genomes (IMG) submission ID 35878] and Ca. E. gemina TSY2 (synonym: Ca. Entotheonella sp. TSY2; NCBI accession no. AZHX00000000.1, IMG submission ID 35877) created by the IMG Expert Review (IMG-ER) system (53). We performed extensive manual reanalyses of genes. For automated gene function and pathway predictions, we used the automatic annotations provided by the IMG-ER platform (53). For manual predictions, we used a combination of searches in the UniProt database (54), as well as a hidden Markov model search in the NCBI Conserved Domains (55), PFAM (45), and TIGRFAMs (56) databases. Reference metabolic pathways were retrieved from MetaCyc (57) and the Kyoto Encyclopedia of Genes and Genomes (58). For the prediction of ABC transporter families, the BLAST function of the ABC transporter database ABCdb (26) was used.

PFAM Abundance Analysis.

A set of 100 genomes representing a wide range of phyla and ecological lifestyles was chosen for a function profile analysis in IMG-ER (Dataset S3). All available functions from the PFAM database (16,230 entries) were selected. All noninformative rows (i.e., PFAMs not present in the dataset) were removed. To obtain a median-centered matrix, the median abundance over 100 genomes of each row was computed and subtracted from each cell.
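The two processing steps described above (removal of noninformative rows, then row-wise median centering) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name `median_center` and the dictionary layout are hypothetical, and the toy matrix has only five columns. The first four pfam02515 counts are those quoted in the text; the fifth column and the two `toy_` rows are invented for the example.

```python
from statistics import median

# Columns: Ca. E. factor, Ca. E. gemina, B. pertussis, S. rapamycinicus,
# plus one hypothetical genome. Only the first four pfam02515 values come
# from the text; everything else is made-up toy data.
TOY_MATRIX = {
    "pfam02515": [87, 93, 22, 17, 3],
    "toy_flat": [5, 5, 5, 5, 5],
    "toy_absent": [0, 0, 0, 0, 0],
}


def median_center(matrix):
    """Drop PFAMs absent from every genome, then subtract each row's median."""
    centered = {}
    for pfam, counts in matrix.items():
        if not any(counts):
            continue  # noninformative row: PFAM not present in the dataset
        row_median = median(counts)
        centered[pfam] = [count - row_median for count in counts]
    return centered


for pfam, row in median_center(TOY_MATRIX).items():
    print(pfam, row)
```

After centering, a family that is uniformly common across genomes drops to zero everywhere, whereas a family expanded in only a few genomes keeps large positive values in those columns.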

Molecular Phylogenetic Analyses.

Selected protein sequences of known MDHs described by Keltjens et al. (28) were obtained from the NCBI. Primary sequences were aligned using the MUSCLE algorithm (59) implemented in MEGA6 (60). Evolutionary analyses were also conducted in MEGA6 using the maximum likelihood method based on the model of Le and Gascuel (61). A discrete gamma distribution, including invariable sites, was used to model evolutionary rate differences among sites.

Supplementary Material

Supplementary File
pnas.1616234114.sd01.xls (138.5KB, xls)
Supplementary File
pnas.1616234114.sd02.xlsx (491.5KB, xlsx)
Supplementary File
pnas.1616234114.sd03.xls (75.5KB, xls)
Supplementary File
Supplementary File
Supplementary File
pnas.1616234114.sd06.xlsx (35.2KB, xlsx)

Acknowledgments

We thank Shigeki Matsunaga, Kentaro Takada (University of Tokyo), and Toshiyuki Wakimoto (Hokkaido University) for sponge samples; Micheal Wilson [Eidgenössische Technische Hochschule Zurich (ETH Zurich)] for isolation of Ca. Entotheonella cells from sponge samples; Bidong Nguyen (ETH Zurich) for providing a culture of M. smithii; Tobias Schwarz (ScopeM, ETH Zurich) for technical support in confocal microscopy; Peter Hunziker (Functional Genomics Center Zurich) for proteomics experiments; and Tobias Erb (Max Planck Institute) and Julia Vorholt (ETH Zurich) for valuable discussions on methanol metabolism. This work was supported by grants from the Swiss National Foundation (205321_165695 to J.P.), European Union (BluePharmTrain to J.P.), and Alexander von Humboldt Foundation (to G.L.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1616234114/-/DCSupplemental.

References

  • 1.Blunt JW, et al. Marine natural products. Nat Prod Rep. 2009;26(2):170–244. doi: 10.1039/b805113p. [DOI] [PubMed] [Google Scholar]
  • 2.Hentschel U, Piel J, Degnan SM, Taylor MW. Genomic insights into the marine sponge microbiome. Nat Rev Microbiol. 2012;10(9):641–654. doi: 10.1038/nrmicro2839. [DOI] [PubMed] [Google Scholar]
  • 3.Taylor MW, Radax R, Steger D, Wagner M. Sponge-associated microorganisms: Evolution, ecology, and biotechnological potential. Microbiol Mol Biol Rev. 2007;71(2):295–347. doi: 10.1128/MMBR.00040-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Bewley CA, Faulkner DJ. Lithistid sponges: Star performers or hosts to the stars. Angew Chem Int Ed. 1998;37(16):2163–2178. doi: 10.1002/(SICI)1521-3773(19980904)37:16<2162::AID-ANIE2162>3.0.CO;2-2. [DOI] [PubMed] [Google Scholar]
  • 5.Gurgui C, Piel J. Metagenomic approaches to identify and isolate bioactive natural products from microbiota of marine sponges. Methods Mol Biol. 2010;668:247–264. doi: 10.1007/978-1-60761-823-2_17. [DOI] [PubMed] [Google Scholar]
  • 6.Bewley CA, Holland ND, Faulkner DJ. Two classes of metabolites from Theonella swinhoei are localized in distinct populations of bacterial symbionts. Experientia. 1996;52(7):716–722. doi: 10.1007/BF01925581. [DOI] [PubMed] [Google Scholar]
  • 7.Schmidt EW, Obraztsova AY, Davidson SK, Faulkner DJ, Haygood MG. Identification of the antifungal peptide-containing symbiont of the marine sponge Theonella swinhoei as a novel delta-proteobacterium, “Candidatus Entotheonella palauensis”. Mar Biol. 2000;136(6):969–977. [Google Scholar]
  • 8.Brück WM, Sennett SH, Pomponi SA, Willenz P, McCarthy PJ. Identification of the bacterial symbiont Entotheonella sp. in the mesohyl of the marine sponge Discodermia sp. ISME J. 2008;2(3):335–339. doi: 10.1038/ismej.2007.91. [DOI] [PubMed] [Google Scholar]
  • 9.Kalesse M. The chemistry and biology of discodermolide. ChemBioChem. 2000;1(3):171–175. doi: 10.1002/1439-7633(20001002)1:3<171::AID-CBIC171>3.0.CO;2-D. [DOI] [PubMed] [Google Scholar]
  • 10.Gunasekera SP, Gunasekera M, Longley RE, Schulte GK. Discodermolide—a new bioactive polyhydroxylated lactone from the marine sponge Discodermia dissoluta. J Org Chem. 1990;55(16):4912–4915. [Google Scholar]
  • 11.Schirmer A, et al. Metagenomic analysis reveals diverse polyketide synthase gene clusters in microorganisms associated with the marine sponge Discodermia dissoluta. Appl Environ Microbiol. 2005;71(8):4840–4849. doi: 10.1128/AEM.71.8.4840-4849.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Keren R, Lavy A, Ilan M. Increasing the richness of culturable arsenic-tolerant bacteria from Theonella swinhoei by addition of sponge skeleton to the growth medium. Microb Ecol. 2016;71(4):873–886. doi: 10.1007/s00248-015-0726-0. [DOI] [PubMed] [Google Scholar]
  • 13.Keren R, Lavy A, Mayzel B, Ilan M. Culturable associated-bacteria of the sponge Theonella swinhoei show tolerance to high arsenic concentrations. Front Microbiol. 2015;6:154. doi: 10.3389/fmicb.2015.00154. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Lavy A, Keren R, Haber M, Schwartz I, Ilan M. Implementing sponge physiological and genomic information to enhance the diversity of its culturable associated bacteria. FEMS Microbiol Ecol. 2014;87(2):486–502. doi: 10.1111/1574-6941.12240. [DOI] [PubMed] [Google Scholar]
  • 15.Wilson MC, et al. An environmental bacterial taxon with a large and distinct metabolic repertoire. Nature. 2014;506(7486):58–62. doi: 10.1038/nature12959. [DOI] [PubMed] [Google Scholar]
  • 16.Piel J, et al. Antitumor polyketide biosynthesis by an uncultivated bacterial symbiont of the marine sponge Theonella swinhoei. Proc Natl Acad Sci USA. 2004;101(46):16222–16227. doi: 10.1073/pnas.0405976101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Freeman MF, et al. Metagenome mining reveals polytheonamides as posttranslationally modified ribosomal peptides. Science. 2012;338(6105):387–390. doi: 10.1126/science.1226121. [DOI] [PubMed] [Google Scholar]
  • 18.Freeman MF, Helf MJ, Bhushan A, Morinaka BI, Piel J. Seven enzymes create extraordinary molecular complexity in an uncultivated bacterium. Nat Chem. November 28, 2016 doi: 10.1038/nchem.2666. [DOI] [PubMed] [Google Scholar]
  • 19.Ueoka R, et al. Metabolic and evolutionary origin of actin-binding polyketides from diverse organisms. Nat Chem Biol. 2015;11(9):705–712. doi: 10.1038/nchembio.1870. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Wakimoto T, et al. Calyculin biogenesis from a pyrophosphate protoxin produced by a sponge symbiont. Nat Chem Biol. 2014;10(8):648–655. doi: 10.1038/nchembio.1573. [DOI] [PubMed] [Google Scholar]
  • 21.Nakashima Y, Egami Y, Kimura M, Wakimoto T, Abe I. Metagenomic analysis of the sponge Discodermia reveals the production of the cyanobacterial natural product kasumigamide by ‘Entotheonella’. PLoS One. 2016;11(10):e0164468. doi: 10.1371/journal.pone.0164468. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Freeman MF, Vagstad AL, Piel J. Polytheonamide biosynthesis showcasing the metabolic potential of sponge-associated uncultivated ‘Entotheonella’ bacteria. Curr Opin Chem Biol. 2016;31:8–14. doi: 10.1016/j.cbpa.2015.11.002. [DOI] [PubMed] [Google Scholar]
  • 23.Morinaka BI, et al. Radical S-adenosyl methionine epimerases: Regioselective introduction of diverse D-amino acid patterns into peptide natural products. Angew Chem Int Ed Engl. 2014;53(32):8503–8507. doi: 10.1002/anie.201400478. [DOI] [PubMed] [Google Scholar]
  • 24.Wilson MC, Piel J. Metagenomic approaches for exploiting uncultivated bacteria as a resource for novel biosynthetic enzymology. Chem Biol. 2013;20(5):636–647. doi: 10.1016/j.chembiol.2013.04.011. [DOI] [PubMed] [Google Scholar]
  • 25.Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol. 2008;26(12):1367–1372. doi: 10.1038/nbt.1511. [DOI] [PubMed] [Google Scholar]
  • 26.Fichant G, Basse MJ, Quentin Y. ABCdb: An online resource for ABC transporter repertories from sequenced archaeal and bacterial genomes. FEMS Microbiol Lett. 2006;256(2):333–339. doi: 10.1111/j.1574-6968.2006.00139.x. [DOI] [PubMed] [Google Scholar]
  • 27.Delmotte N, et al. Community proteogenomics reveals insights into the physiology of phyllosphere bacteria. Proc Natl Acad Sci USA. 2009;106(38):16428–16433. doi: 10.1073/pnas.0905240106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Keltjens JT, Pol A, Reimann J, Op den Camp HJ. PQQ-dependent methanol dehydrogenases: Rare-earth elements make a difference. Appl Microbiol Biotechnol. 2014;98(14):6163–6183. doi: 10.1007/s00253-014-5766-8. [DOI] [PubMed] [Google Scholar]
  • 29.Pol A, et al. Rare earth metals are essential for methanotrophic life in volcanic mudpots. Environ Microbiol. 2014;16(1):255–264. doi: 10.1111/1462-2920.12249. [DOI] [PubMed] [Google Scholar]
  • 30.Misset-Smits M, van Ophem PW, Sakuda S, Duine JA. Mycothiol, 1-O-(2′-[N-acetyl-L-cysteinyl]amido-2′-deoxy-alpha-D-glucopyranosyl)-D- myo-inositol, is the factor of NAD/factor-dependent formaldehyde dehydrogenase. FEBS Lett. 1997;409(2):221–222. doi: 10.1016/s0014-5793(97)00510-3. [DOI] [PubMed] [Google Scholar]
  • 31.Maia LB, Moura JJ, Moura I. Molybdenum and tungsten-dependent formate dehydrogenases. J Biol Inorg Chem. 2015;20(2):287–309. doi: 10.1007/s00775-014-1218-2. [DOI] [PubMed] [Google Scholar]
  • 32.Yang M, et al. Atmospheric deposition of methanol over the Atlantic Ocean. Proc Natl Acad Sci USA. 2013;110(50):20034–20039. doi: 10.1073/pnas.1317840110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Chistoserdova L, Chen SW, Lapidus A, Lidstrom ME. Methylotrophy in Methylobacterium extorquens AM1 from a genomic point of view. J Bacteriol. 2003;185(10):2980–2987. doi: 10.1128/JB.185.10.2980-2987.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Sidhu H, et al. DNA sequencing and expression of the formyl coenzyme A transferase gene, frc, from Oxalobacter formigenes. J Bacteriol. 1997;179(10):3378–3381. doi: 10.1128/jb.179.10.3378-3381.1997. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Cerrano C, et al. Calcium oxalate production in the marine sponge Chondrosia reniformis. Mar Ecol Prog Ser. 1999;179:297–300. [Google Scholar]
  • 36.Poole AM, Logan DT, Sjöberg BM. The evolution of the ribonucleotide reductases: Much ado about oxygen. J Mol Evol. 2002;55(2):180–196. doi: 10.1007/s00239-002-2315-3. [DOI] [PubMed] [Google Scholar]
  • 37.Pelz A, et al. Structure and biosynthesis of staphyloxanthin from Staphylococcus aureus. J Biol Chem. 2005;280(37):32493–32498. doi: 10.1074/jbc.M505070200. [DOI] [PubMed] [Google Scholar]
  • 38.Clauditz A, Resch A, Wieland KP, Peschel A, Götz F. Staphyloxanthin plays a role in the fitness of Staphylococcus aureus and its ability to cope with oxidative stress. Infect Immun. 2006;74(8):4950–4953. doi: 10.1128/IAI.00204-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Liu GY, et al. Staphylococcus aureus golden pigment impairs neutrophil killing and promotes virulence through its antioxidant activity. J Exp Med. 2005;202(2):209–215. doi: 10.1084/jem.20050846. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Kutta RJ, et al. The photochemical mechanism of a B12-dependent photoreceptor protein. Nat Commun. 2015;6:7907. doi: 10.1038/ncomms8907. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Rohmer M. The discovery of a mevalonate-independent pathway for isoprenoid biosynthesis in bacteria, algae and higher plants. Nat Prod Rep. 1999;16(5):565–574. doi: 10.1039/a709175c. [DOI] [PubMed] [Google Scholar]
  • 42.Hamada T, Sugawara T, Matsunaga S, Fusetani N. Bioactive marine metabolism. 56. Polytheonamides, unprecedented highly cytotoxic polypeptides from the marine sponge Theonella swinhoei. 2. Structure elucidation. Tetrahedron Lett. 1994;35(4):609–612. [Google Scholar]
  • 43.Tsukamoto S, Matsunaga S, Fusetani N, Toh-e A. Theopederins F-J: Five new antifungal and cytotoxic metabolites from the marine sponge, Theonella swinhoei. Tetrahedron. 1999;55(48):13697–13702.
  • 44.Fusetani N, Matsunaga S. Bioactive sponge peptides. Chem Rev. 1993;93(5):1793–1806.
  • 45.Finn RD, et al. The Pfam protein families database: Towards a more sustainable future. Nucleic Acids Res. 2016;44(D1):D279–D285. doi: 10.1093/nar/gkv1344.
  • 46.Heider J. A new family of CoA-transferases. FEBS Lett. 2001;509(3):345–349. doi: 10.1016/s0014-5793(01)03178-7.
  • 47.Boll M, et al. Redox centers of 4-hydroxybenzoyl-CoA reductase, a member of the xanthine oxidase family of molybdenum-containing enzymes. J Biol Chem. 2001;276(51):47853–47862. doi: 10.1074/jbc.M106766200.
  • 48.Suwa Y, et al. Characterization of a chromosomally encoded 2,4-dichlorophenoxyacetic acid/alpha-ketoglutarate dioxygenase from Burkholderia sp. strain RASC. Appl Environ Microbiol. 1996;62(7):2464–2469. doi: 10.1128/aem.62.7.2464-2469.1996.
  • 49.Selengut JD, Haft DH. Unexpected abundance of coenzyme F(420)-dependent enzymes in Mycobacterium tuberculosis and other actinobacteria. J Bacteriol. 2010;192(21):5788–5798. doi: 10.1128/JB.00425-10.
  • 50.Ling LL, et al. A new antibiotic kills pathogens without detectable resistance. Nature. 2015;517(7535):455–459. doi: 10.1038/nature14098.
  • 51.Piddock LJ. Teixobactin, the first of a new class of antibiotics discovered by iChip technology? J Antimicrob Chemother. 2015;70(10):2679–2680. doi: 10.1093/jac/dkv175.
  • 52.Liu F, Li J, Feng G, Li Z. New genomic insights into “Entotheonella” symbionts in Theonella swinhoei: Mixotrophy, anaerobic adaptation, resilience, and interaction. Front Microbiol. 2016;7(1333):1333. doi: 10.3389/fmicb.2016.01333.
  • 53.Markowitz VM, et al. IMG: The Integrated Microbial Genomes database and comparative analysis system. Nucleic Acids Res. 2012;40(Database issue):D115–D122. doi: 10.1093/nar/gkr1044.
  • 54.UniProt Consortium. UniProt: A hub for protein information. Nucleic Acids Res. 2015;43(Database issue):D204–D212. doi: 10.1093/nar/gku989.
  • 55.Marchler-Bauer A, et al. CDD: A Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res. 2011;39(Database issue):D225–D229. doi: 10.1093/nar/gkq1189.
  • 56.Haft DH, et al. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41(Database issue):D387–D395. doi: 10.1093/nar/gks1234.
  • 57.Caspi R, et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res. 2014;42(Database issue):D459–D471. doi: 10.1093/nar/gkt1103.
  • 58.Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 2016;44(D1):D457–D462. doi: 10.1093/nar/gkv1070.
  • 59.Edgar RC. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–1797. doi: 10.1093/nar/gkh340.
  • 60.Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 2013;30(12):2725–2729. doi: 10.1093/molbev/mst197.
  • 61.Le SQ, Gascuel O. An improved general amino acid replacement matrix. Mol Biol Evol. 2008;25(7):1307–1320. doi: 10.1093/molbev/msn067.

Associated Data

Supplementary Materials

Supplementary File
pnas.1616234114.sd01.xls (138.5KB, xls)
Supplementary File
pnas.1616234114.sd02.xlsx (491.5KB, xlsx)
Supplementary File
pnas.1616234114.sd03.xls (75.5KB, xls)
Supplementary File
Supplementary File
Supplementary File
pnas.1616234114.sd06.xlsx (35.2KB, xlsx)

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
