2021 May 10;5(7):927–938. doi: 10.1038/s41559-021-01448-6

The genome of Nautilus pompilius illuminates eye evolution and biomineralization

Yang Zhang 1,2,3,#, Fan Mao 1,2,3,#, Huawei Mu 4,#, Minwei Huang 1,2,3,#, Yongbo Bao 5,#, Lili Wang 6,#, Nai-Kei Wong 1, Shu Xiao 1,2,3, He Dai 6, Zhiming Xiang 1,2,3, Mingli Ma 6, Yuanyan Xiong 7, Ziwei Zhang 7, Lvping Zhang 1,2,3, Xiaoyuan Song 4, Fan Wang 6, Xiyu Mu 6, Jun Li 1,2,3, Haitao Ma 1,2,3, Yuehuan Zhang 1,2,3, Hongkun Zheng 6, Oleg Simakov 8, Ziniu Yu 1,2,3
PMCID: PMC8257504  PMID: 33972735

Abstract

Nautilus is the sole surviving externally shelled cephalopod from the Palaeozoic. It is unique within cephalopod genealogy and critical to understanding the evolutionary novelties of cephalopods. Here, we present a complete Nautilus pompilius genome as a fundamental genomic reference on cephalopod innovations, such as the pinhole eye and biomineralization. Nautilus shows a compact, minimalist genome with few encoding genes and slow evolutionary rates in both non-coding and coding regions among known cephalopods. Importantly, multiple genomic innovations including gene losses, independent contraction and expansion of specific gene families and their associated regulatory networks likely moulded the evolution of the nautilus pinhole eye. The conserved molluscan biomineralization toolkit and lineage-specific repetitive low-complexity domains are essential to the construction of the nautilus shell. The nautilus genome constitutes a valuable resource for reconstructing the evolutionary scenarios and genomic innovations that shape the extant cephalopods.

Subject terms: Evolution, Genomics


Nautilus, the sole surviving externally shelled cephalopod from the Palaeozoic, holds an important phylogenetic position to understand the evolution of cephalopods. A complete genome of Nautilus pompilius sheds light on the evolution of the pinhole eye and biomineralization.

Main

Nautilus is the only surviving externally shelled cephalopod among hundreds of extinct cephalopod genera since the Palaeozoic; it is deemed unique for its persistent ancestral features despite a long evolutionary history1. Palaeobiological evidence shows that the nautilus lineage has preserved plesiomorphic phenotypes such as a chambered shell and primary lens-less eye (pinhole eye)2. A phenotypic peculiarity of the adult nautilus shell is that it consists of over 30 chambers: the soft body is accommodated and protected in the outermost chamber, whereas the remaining chambers act as a constant volume hydrostatic apparatus to maintain buoyancy. Moreover, the elegant architecture of the nautilus chambered shell takes the form of a logarithmic spiral conforming to the golden ratio and is composed of sturdy arrays of aragonite crystals, leading to its high degree of hydrostatic stability3. Nautilus possesses a unique and simple pinhole eye without lens or cornea, which provides an excellent prototypical model for illuminating the evolution of the eye. Additionally, nautilus is adept in spatial learning and temporally separated biphasic memory even though its brain is disproportionately simple among extant cephalopods4,5. As a sister group to nautilus, coleoid cephalopods (such as the octopus, squid and cuttlefish) are perhaps the most intelligent and extraordinarily complex invertebrates, with striking morphological and behavioural innovations including a sophisticated camera eye, internalization of the external shell, and unusual learning and problem-solving abilities6–8. Thus, investigating the nautilus genome could furnish valuable insights into the evolutionary drivers of cephalopod innovations.

Recently, genomic sequencing efforts in coleoids revealed that specific gene family expansions and genome rearrangements may drive the evolution of morphological novelties in these organisms9–12. Moreover, transcriptomic analyses have pointed out that RNA editing could allow high plasticity of transcripts, which is associated with thermal adaptation and neural functions13,14. However, genomic sequence availability is still limited in coleoid species9–12 and a non-coleoid cephalopod genome is urgently needed. In this study, we sequenced the complete genome of Nautilus pompilius in the hope of providing a critical reference for the evolution of cephalopods.

N. pompilius is the most widespread species among nautiluses and is distributed across the Indo-Pacific region15. However, its population has recently declined dramatically due to a mix of unfavourable circumstances, including commercial exploitation of ornamental shells, a lack of legal protection and very slow sexual maturation16. Therefore, genome studies of N. pompilius would not only shed light on the origin and evolution of cephalopod genomic novelties but also incentivize research on their biology and inform sustainable conservation. Our analyses reveal that the nautilus genome is smaller than the published genomes of coleoid cephalopods; it contains the fewest protein-coding genes and has the lowest evolutionary rate reported so far in the group. Comparative genomics analysis revealed that gene losses and gene family contraction are jointly associated with pinhole eye formation in nautilus, suggesting plausible degeneration from a more complex organ. Unique and new protein-coding genes involved in shell formation contribute to the production of aragonite crystals, a major component of the nautilus shell. Moreover, lineage-specific expansion of gene families implicates the active operation of distinct evolutionary strategies of innate immune defence in different cephalopods.

Results

Genomic architecture of N. pompilius

The N. pompilius genome was sequenced with 112.5× coverage of PacBio sequencing reads and 81.8× coverage of Illumina sequencing reads. After de novo assembly via a hybrid approach, these reads were assembled into a 730.58-megabase (Mb) genome with a contig N50 of 1.1 Mb (Supplementary Table 1), which is approximately equal to the estimated genome size of 753.09 Mb by k-mer analysis (Supplementary Fig. 1). Integrity of the assembly is demonstrated by 96.83–97.01% of sequencing reads mapping (Supplementary Table 2) and 91.31% of Benchmarking Universal Single-Copy Orthologs (BUSCO) completeness (Supplementary Table 3). The N. pompilius genome is the smallest among the cephalopods sequenced so far, accounting for only 13.8–41.2% of recently available coleoid genomes (Supplementary Fig. 2)9–12. Repetitive elements, including transposable elements (TEs), are a major and ubiquitous genomic component and a driving force in shaping genomic architecture and evolution17–19. Comparative analysis further revealed that the make-up of TEs in N. pompilius is strikingly different from that of coleoid lineages (Fig. 1a and Supplementary Table 4). In the N. pompilius genome, TEs make up about 30.95% of the genome; class II DNA transposons predominate (15.55%), whereas class I retrotransposons (long interspersed nuclear element (LINE), long terminal repeat (LTR) and short interspersed nuclear element (SINE)) constitute a minor portion of the genome (6.48%). Retrotransposons, however, are prominent in coleoid cephalopods9–12. Furthermore, Kimura distance-based copy divergence analysis indicates a single ancient burst of DNA transposons and no recent TE expansion in the N. pompilius genome (Fig. 1b and Supplementary Fig. 3). In contrast, retrotransposon (LINE and LTR) bursts were observed in coleoid cephalopods (Extended Data Fig. 1 and Table 5), corroborating the critical role of retrotransposons in driving coleoid genome evolution19.
Therefore, a higher proportion of DNA elements and the absence of retrotransposon expansions make the nautilus genome surprisingly more similar to other molluscan genomes, such as that of Lottia gigantea, which is suggestive of slow evolutionary rates in the non-coding regions in nautilus lineages. Moreover, we also examined the evolutionary rates of the coding region in cephalopods based on Tajima’s relative rate test, which revealed slow evolutionary rates in the coding regions of N. pompilius (Supplementary Table 6). Consistently, based on the branch lengths of the neutral tree (Supplementary Fig. 4) and actual distances to the out-group (Supplementary Table 7), smaller pairwise distances from N. pompilius to L. gigantea (4.969 fourfold degenerate (4D) substitutions per site) relative to those from coleoid cephalopods to L. gigantea (5.132–5.211 4D substitutions per site) were observed. N. pompilius apparently experienced fewer intron gains or losses than coleoid cephalopods after its divergence from the cephalopod ancestor (Supplementary Fig. 4), lending support to its slow-evolving features.
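The relative rate comparison mentioned above can be illustrated with a minimal sketch of Tajima's test on three aligned sequences. This is not the authors' pipeline: the function name, the toy nucleotide sequences and the choice to skip gapped or ambiguous sites are assumptions made for illustration. The statistic compares the number of sites unique to each ingroup lineage (m1 and m2) and is evaluated against a chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt


def tajima_relative_rate(seq1, seq2, outgroup):
    """Tajima's relative rate test on three aligned nucleotide sequences.

    m1 counts sites where seq1 differs while seq2 matches the outgroup;
    m2 counts the reverse. Returns (m1, m2, chi2, p_value).
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if not {a, b, o} <= valid:
            continue  # assumption: skip gaps and ambiguous bases
        if a != b and b == o:
            m1 += 1  # change unique to lineage 1
        elif a != b and a == o:
            m2 += 1  # change unique to lineage 2
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return m1, m2, chi2, p_value


# Toy alignment (hypothetical sequences, not data from the paper).
outgroup = "ACGTACGTAC"
slow = "TCGTACGTAC"  # one lineage-specific change
fast = "ACGTAGTCCA"  # five lineage-specific changes
m1, m2, chi2, p_value = tajima_relative_rate(slow, fast, outgroup)
print(m1, m2, chi2, p_value)
```

A smaller count for one lineage (here `m1`) indicates that it accumulated fewer unique substitutions since the split, which is the sense in which a lineage is described as slow-evolving.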

Fig. 1. Genomic structure of the N. pompilius genome and cephalopod phylogeny.

a, Proportions of DNA transposons, LTR, LINE and SINE retrotransposons in the genomes of five representative cephalopods including N. pompilius, E. scolopes, O. bimaculoides, O. minor and O. vulgaris. The tree delineates the evolutionary relationships among the five cephalopod species. The pie charts are scaled according to genome size (Supplementary Fig. 2). b, History of TE accumulation in the N. pompilius genome. c, A phylogenetic tree was constructed with 423 orthologues from 16 metazoan animals using OrthoMCL with a Markov cluster algorithm. Divergence time was estimated with the approximate likelihood calculation method in conjunction with a molecular clock model. A bar within a branch indicates the 95% confidence interval of divergence time. The positive and negative numbers adjacent to the taxon names are gene family numbers of expansion/contraction obtained from the CAFE analysis. d, Demographic history of cephalopods. Historical effective population size (Ne) was estimated by using the PSMC method. The synonymous mutation rate per base per year in N. pompilius was inferred based on the formula T = ks/(2λ), with a generation time of 15 years. The synonymous mutation rate of N. pompilius was estimated as 2.77 × 10⁻⁹ and that of other cephalopods as 4.07 × 10⁻⁹. Estimation was performed with 100 bootstraps. Pivotal turning points in environmental evolution during the last million years are labelled with blue arrows.
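The relation T = ks/(2λ) in panel d can be rearranged to λ = ks/(2T). The short sketch below shows that rearrangement and the conversion of a per-year rate to a per-generation rate. The function names are illustrative, and the ks and divergence-time values in the round-trip example are hypothetical placeholders, since the caption does not state which values were used; only the rate of 2.77 × 10⁻⁹ and the 15-year generation time come from the caption.

```python
def substitution_rate_per_year(ks, divergence_time_years):
    """Rearrangement of T = ks / (2 * lambda): lambda = ks / (2 * T)."""
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return ks / (2.0 * divergence_time_years)


def rate_per_generation(rate_per_year, generation_time_years):
    """Convert a per-site per-year rate to a per-site per-generation rate."""
    return rate_per_year * generation_time_years


# Values stated in the caption.
NAUTILUS_RATE_PER_YEAR = 2.77e-9
GENERATION_TIME_YEARS = 15

mu_per_generation = rate_per_generation(NAUTILUS_RATE_PER_YEAR, GENERATION_TIME_YEARS)
print(f"per-generation rate: {mu_per_generation:.3e}")

# Hypothetical inputs, for illustration only (not values from the paper).
example_ks = 0.5
example_divergence_years = 100e6
example_rate = substitution_rate_per_year(example_ks, example_divergence_years)
print(f"example rate: {example_rate:.2e} per site per year")
```

The factor of 2 reflects that synonymous differences between two lineages accumulate along both branches since their split.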

Extended Data Fig. 1. Distribution of the divergence rate of each type of repetitive element.

Historical transposable element (TE) divergence was compared among Octopus bimaculoides, Octopus minor, Octopus vulgaris, Euprymna scolopes, Lottia gigantea and Nautilus pompilius, calculated by Kimura distance-based copy divergence analysis.
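As a rough illustration of what a Kimura distance-based divergence measures, the sketch below computes the standard Kimura two-parameter distance between a TE copy and its consensus, K = −½ ln((1 − 2P − Q)·√(1 − 2Q)), where P and Q are the proportions of transitions and transversions. This is a plain textbook version written under stated assumptions: repeat-annotation tools may apply additional corrections (for example for CpG sites) that are not reproduced here, and the sequences are toy examples.

```python
from math import log, sqrt

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * log(term1 * sqrt(term2))


# Toy example (hypothetical consensus and copy).
consensus = "ACGTACGTAC"
te_copy = "GCGTAAGTAC"  # one transition (A>G) and one transversion (C>A)
print(kimura_2p(consensus, te_copy))
```

Binning such distances across all copies of a TE family gives a divergence histogram: older bursts show up as peaks at larger distances, recent activity as peaks near zero.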

Another cardinal feature of the N. pompilius genome is that it encodes fewer genes than the genomes of other cephalopods. Whole-genome annotation identified 17,710 protein-coding genes by integrating multiple methods (Supplementary Fig. 5, Extended Data Fig. 2 and Tables 8 and 9), which is supported by 93.46% BUSCO completeness (Supplementary Table 10). This is equivalent to only 52.6–60.5% of the gene numbers in octopuses and squids9–12. Consistently, Computational Analysis of (gene) Family Evolution (CAFE) analysis reveals a large contraction of orthologous gene families in the N. pompilius genome, with 204 contracted and 9 expanded gene families (Fig. 1c and Supplementary Table 11). Our results also support extensive gene duplications or expansions occurring during coleoid evolution and divergence. Notably, massive expansions of zinc-finger transcription factors and protocadherins, which have previously been noted in the octopus genome with functional implications for neurogenesis and adaptive innovations in the nervous system9,19, were not overrepresented in the N. pompilius genome (Extended Data Fig. 3). Most strikingly, 18 centromere protein B (CENPB) domain-containing genes were identified, and this family is specifically expanded in the N. pompilius genome (Extended Data Fig. 3). Accumulating evidence has shown that CENPB plays crucial roles in host genome integrity and replication fidelity through the repression of retrotransposons and centromere formation in yeast or humans20,21. Therefore, CENPB expansion may serve as a possible host genome surveillance machinery for maintaining integrity of the ancient genome.

Extended Data Fig. 2. Comparison of gene repertoires in metazoan genomes.

‘One-copy’ indicates single-copy genes. ‘Multi-copy’ indicates orthologous genes present in multiple copies in all taxa. ‘Other gene’ refers to other orthologues that are present in at least one genome. Both ‘Unigene’ and ‘Uncluster’ indicate genes for which no orthologue was found in the other genomes; ‘Unigene’ contains at least two paralogues, whereas ‘Uncluster’ contains only a single copy.

Extended Data Fig. 3. Heatmap on specifically expanded gene families in the N. pompilius genome.

A number of expanded gene families were found, based on domain analysis in the N. pompilius genome. In particular, 18 genes containing the centromere protein B (CENP-B) domain (PF04218.12) were identified in the N. pompilius genome, which makes N. pompilius the species with the most CENP-B containing genes in metazoans by far. Lineage-specific expansions of zinc-finger domains and cadherin are also observed in the coleoids.

Phylogenetic analysis and population size estimation

To explore the timing and mode of cephalopod evolution, phylogenetic relationships were constructed for 423 single-copy orthologues from 16 animal genomes with OrthoMCL (Fig. 1c). Our phylogenetic results confirm that nautilus is a sister group to coleoids22 and their divergence is estimated at around the Silurian–Devonian boundary (422.6 million years ago (Ma)), which is congruent with unequivocal evidence for haemocyanin molecular clock inference (415 Ma) and extensive Nautilus fossil records dating back to the early Devonian23,24. It was previously hypothesized that diversity of modern coleoid cephalopods emerged during a period of Mesozoic marine revolution25. Our results support this assumption in the light of findings on coleoid divergence at the early Triassic (236 Ma), the period after Permian–Triassic extinction25. Moreover, our phylogenetic inference further revealed that divergence and speciation of ancient molluscs initiated in the Ediacaran period, during which progressive diversification and biological novelty emerged in the early metazoans26.
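Selecting single-copy orthologues for a phylogeny amounts to keeping only the orthologous groups in which every species contributes exactly one gene. The sketch below assumes the common `group: taxon|gene taxon|gene ...` text layout for orthologue groups; the group names, taxon codes and gene identifiers are hypothetical, and the authors' actual filtering procedure is not described in this section.

```python
from collections import Counter


def parse_groups(lines):
    """Parse lines such as 'OG1: npo|g1 obi|g7' into {group: [(taxon, gene), ...]}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, members = line.split(":", 1)
        groups[name.strip()] = [tuple(m.split("|", 1)) for m in members.split()]
    return groups


def single_copy_groups(groups, species):
    """Return names of groups with exactly one gene from each listed species."""
    wanted = set(species)
    selected = []
    for name, members in groups.items():
        counts = Counter(taxon for taxon, _gene in members)
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            selected.append(name)
    return selected


# Toy input with hypothetical taxon codes and gene identifiers.
toy_lines = [
    "OG1: npo|g1 obi|g7 lgi|g3",        # single copy in all three species
    "OG2: npo|g2 obi|g8 obi|g9 lgi|g4",  # duplicated in one species
    "OG3: npo|g5 lgi|g6",                # missing from one species
]
toy_groups = parse_groups(toy_lines)
print(single_copy_groups(toy_groups, ["npo", "obi", "lgi"]))
```

Restricting the alignment to such groups avoids confusing paralogues with orthologues when inferring species relationships.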

To better appreciate the dynamic changes in ancestral population sizes of N. pompilius and other cephalopods, we assessed the dynamic effective population size (Ne) by employing the pairwise sequential Markovian coalescent (PSMC) method (Fig. 1d). From a perspective of demographic history, profound effects on shaping the N. pompilius population are discernible in two crucial environmental evolution events during the last few million years. In particular, N. pompilius populations expanded in a stepwise manner at the turn of the Miocene (22.6 Ma). Nevertheless, their ascent came to a halt at the early phase of the Mid-Pleistocene Transition, which is consistent with fundamental climate changes, such as prolongation of glacial cycles prevailing during the period27. Most strikingly, a precipitous fall in N. pompilius populations occurred at 0.38 Ma, which is close to the onset of the Mid-Brunhes Event (MBE) around 0.4 Ma28. The MBE is considered a critical period marked by intensified amplitudes of glacial cycles, wherein variations in ice core temperature and atmospheric CO2 concentrations abruptly increased29,30. Thus, decimation of the N. pompilius population suggests an intrinsic susceptibility to extreme environmental fluctuations. However, we observed that MBE is also a turning point for population expansion of some coleoid species like Euprymna scolopes and Octopus vulgaris, reflecting the subtle effects of MBE on shaping the demographic composition of cephalopods. Additionally, the effective population size of several bony fishes with a sympatric distribution with nautilus also expanded during the MBE31,32, strongly suggesting that ecological competition was likely a pivotal driver of demographic changes in N. pompilius.
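PSMC reports time and population size in scaled units, which are converted to years and effective population size using a mutation rate and a generation time. The sketch below follows the commonly used scaling (N0 = θ0 / (4μs), time in years = 2·N0·t·g, Ne = N0·λ), where s is the bin size used when preparing the input (100 by default). The function name and all numeric inputs other than the mutation rate and generation time are hypothetical; it is an illustration of the unit conversion, not the authors' analysis.

```python
def scale_psmc(theta0, scaled_times, scaled_sizes,
               mu_per_generation, generation_time_years, bin_size=100):
    """Convert PSMC scaled output to years and effective population size.

    theta0        -- per-bin theta estimate from PSMC
    scaled_times  -- interval boundaries t_k in units of 2*N0 generations
    scaled_sizes  -- relative population sizes lambda_k
    """
    n0 = theta0 / (4.0 * mu_per_generation * bin_size)
    years = [2.0 * n0 * t * generation_time_years for t in scaled_times]
    ne = [n0 * lam for lam in scaled_sizes]
    return n0, years, ne


# Rate and generation time as stated for N. pompilius in the Fig. 1 caption.
MU_PER_GENERATION = 2.77e-9 * 15
GENERATION_TIME_YEARS = 15

# Hypothetical PSMC output, chosen so that N0 is a round number.
theta0 = 4 * 10_000 * MU_PER_GENERATION * 100
n0, years, ne = scale_psmc(theta0, [0.0, 0.5, 1.0], [1.0, 2.0, 0.5],
                           MU_PER_GENERATION, GENERATION_TIME_YEARS)
print(round(n0), [round(y) for y in years], [round(n) for n in ne])
```

Because both axes scale inversely with the assumed mutation rate, the absolute dates of demographic turning points depend directly on that assumption.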

Homeobox gene cluster analysis

Given that homeobox (Hox) genes arose as key transcription factors essential to body patterning and tissue segmentation during metazoan evolution33,34, it is of great interest to explore the genetic basis for body plan evolution in cephalopods by comparing the organization of Hox clusters in multiple lineages. Previous studies have suggested that Lophotrochozoa (molluscan) ancestors preserved intact Hox clusters35,36. In this study, our results show that the N. pompilius genome contains a complete set of molluscan Hox genes (Fig. 2). Moreover, messenger RNA abundance analysis of Hox members reveals tissue-specific expression patterns in N. pompilius (Supplementary Fig. 6). One prominent innovation in coleoids is the loss of an external shell, which has been internalized as a buoyancy compensation apparatus37. Consequently, such innovations enabled coleoids to free themselves from a ponderous external shell and drove their remarkable diversification4. Correspondingly, Hox2 in E. scolopes and Hox2–Hox4 in Octopus bimaculoides are missing (Fig. 2). In parallel, the California sea hare Aplysia californica, one of the gastropod species without an external shell, also lost Hox2, Hox4 and Antp independently (Fig. 2), suggesting that the disruption of Hox cluster integrity may be linked to the evolutionary loss of an external shell in molluscan lineages. Consistent with this view, changes in spatio-temporal collinearity and dorsoventral decoupling of Hox gene expression contributed notably to evolutionary diversity in molluscan lineages35,38.

Fig. 2. Schematic representation of Hox gene clusters in metazoan genomes.

Comparison of chromosomal organization of Hox gene clusters of N. pompilius with other animals. Different Hox genes are labelled with coloured boxes. Double slashes indicate that the scaffold of the Hox cluster is non-contiguous or interrupted.

Evolution of the pinhole eye

The pinhole eye is one of the most peculiar and remarkable features of nautilus, where an adjustable pupil instead of a lens creates a relatively dim image on the retina. Vertical sections of the N. pompilius pinhole eye reveal that its retina contains a single layer of rhabdomeric photoreceptor cells (Extended Data Fig. 4), which is a visual sensor universally distributed in invertebrates including coleoid cephalopods39,40. Compared to the sophisticated camera eyes in coleoids, the relative structural simplicity of the pinhole eye makes it an excellent model for reconstructing ancient evolutionary scenarios narrating the genesis of the eye and/or lens formation. It has been postulated that changes in the ‘core regulatory complex’ of transcription factors are essential for driving the evolution of functionally specific cells or organs41,42. Our genomic searches for the core regulatory transcription factors governing lens formation reveal that nearly all these core regulators including PAX6, SIX3/6 and SOX2 are present in the nautilus genome (Fig. 3a). Previously, palaeontological studies reported that fossil eyes with lenses emerged during the early Cambrian, thus supporting the ancient origin of the lens43. Exceptionally, our comparative results indicate a lineage-specific loss of the Nrl/Maf (large Maf) gene in the N. pompilius genome (Fig. 3a and Supplementary Table 12). Phylogenetic analysis shows that molluscan Nrl/Mafa–Mafc belong to the large Maf superfamily and their orthologues diverge into four clades (Mafa, Mafb, c-Maf and Nrl) in vertebrates (Fig. 3b and Supplementary Figs. 7 and 8). Experimental evidence further supports the notion that members of the large Maf family are lens-specific in expression and play a central role in lens induction and differentiation in vertebrates44,45.
Moreover, recruitment of Nrl or c-Maf can augment PAX6-induced crystallins, which are the most abundant lens structural proteins required for light refraction and transparency46. As expected, ten crystallin-like genes are identified in the N. pompilius genome and are conspicuously contracted compared to other lens-equipped molluscs (Fig. 3a). In particular, the phylogenetic tree further reveals that lineage-specific expansion of S-crystallin is found in coleoids and none of the S-crystallin genes is encoded in the N. pompilius genome (Fig. 3c and Supplementary Figs. 9–11), in agreement with their roles as major constitutive lens proteins in cephalopods47. Furthermore, investigation of transcriptional regulatory sites on crystallin proximal upstream sequences reveals that NRL/MAF binding motifs are more abundant in coleoids than in N. pompilius (Supplementary Fig. 12), underscoring that independent gene losses in nautilus and expansion of crystallins in coleoids may be instrumental in driving eye evolution in cephalopods. However, a previous transcriptomic study reported lineage-specific loss of SIX3/6 expression in the N. pompilius embryo48, raising the possibility that alteration in core regulatory transcription factor expression may lead to evolutionary divergence of the eye.
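Comparing binding-motif abundance in upstream sequences can be sketched as counting motif occurrences on both strands and normalizing by sequence length. The example below is a simplified exact-match counter; the motif string `TGCTGAC` is used only as an illustrative placeholder for a MAF-type half-site, and the sequence is a toy. The motif model, promoter window and scanning tool used in the study are not given in this section, so none of these choices should be read as the authors' method.

```python
import re


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_motif_sites(sequence, motif):
    """Count exact motif matches on both strands, allowing overlaps."""
    sequence = sequence.upper()
    motif = motif.upper()
    patterns = {motif, reverse_complement(motif)}  # set avoids double-counting palindromes
    total = 0
    for pattern in patterns:
        # Zero-width lookahead so that overlapping matches are counted.
        total += len(re.findall(f"(?={re.escape(pattern)})", sequence))
    return total


def sites_per_kb(sequence, motif):
    """Motif density normalized to sites per 1,000 bases."""
    if not sequence:
        raise ValueError("empty sequence")
    return 1000.0 * count_motif_sites(sequence, motif) / len(sequence)


# Toy upstream sequence with one forward and one reverse-strand match.
PLACEHOLDER_MOTIF = "TGCTGAC"  # hypothetical example motif
upstream = "AATGCTGACAAGTCAGCATT"
print(count_motif_sites(upstream, PLACEHOLDER_MOTIF), sites_per_kb(upstream, PLACEHOLDER_MOTIF))
```

In practice, position weight matrices with score thresholds are typically preferred over exact matching, since real binding sites tolerate some mismatches.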

Extended Data Fig. 4. Histological analysis of the pinhole eye in N. pompilius.

Histological features of the pinhole eye were examined in tissue sections after hematoxylin and eosin (HE) staining. Full view (panel a) and partial enlargement (panels b and c) show the photoreceptor and ganglion cells in a single optical layer.

Fig. 3. Loss of NRL and contraction of crystallin genes are linked to the evolution of the pinhole eye.

a, Distribution of core transcription factors crucial for regulating lens development and key optic gene families in multiple metazoans; the ‘+’ and ‘−’ symbols indicate eyes with and without lenses, respectively. b, Phylogenetic analysis of NRL/MAF superfamily from representative metazoans. The phylogenetic tree was constructed using MrBayes under a mixed model of amino acid substitution. The degree of support for internal branching is shown as a probability percentage at the base of each node. Notably, the large MAF only preserves one copy in molluscs but diverges into four clades in vertebrates. N. pompilius is the only extant species that has lost NRL. c, Phylogenetic analysis of crystallin superfamily from representative metazoans. Coleoid cephalopods, N. pompilius and non-cephalopod metazoans are indicated by the black, red and grey branches, respectively. For detailed results, see Supplementary Fig. 10.

As a nocturnal predator, nautilus has evolved the characteristic behaviour of vertical depth migration into shallower waters at night49,50. Understandably, light sensing and spatial vision are fundamental prerequisites for achieving this task. Phylogenetic evidence shows that the N. pompilius genome encodes one photoreceptive r-opsin gene and one retinochrome gene, representing the minimal opsin gene number among known metazoans (Fig. 3a and Extended Data Fig. 5). Moreover, expression pattern analysis reveals that r-opsin and its associated signalling cascades are predominantly expressed in the eye (Fig. 4), suggesting that the principal role of r-opsin lies in mediating rhabdomeric phototransduction in N. pompilius51,52. With a fair degree of certainty, monotonic r-opsin does not support colour discrimination in N. pompilius, suggesting colour blindness in nautilus as described in most cephalopods53.

Extended Data Fig. 5. Phylogenetic tree of the opsin gene family.

The phylogenetic tree was constructed with MrBayes as described above. The melatonin receptor clade was set as an outgroup. Based on the topological structure, the ancestor of opsin divided into different clades: r-opsin (Annelid r-opsin, Mollusc r-opsin, melanopsin, and canonical r-opsin)/C-opsin/Go-opsin (Xenopsin, Neuropsin, Go opsin, and Retinochrome) clade. One r-opsin (EVMG007539) and one retinochrome (EVMG008353) were identified in the N. pompilius genome and marked in red.

Fig. 4. Visual model of N. pompilius.

Key components of visual retinoid cycles and opsin signalling cascades have been identified in the N. pompilius genome. The heatmap of visual cycles and opsin signalling cascades indicates specific expression patterns in the eye116,117. PKC, protein kinase C; PLC, phosphoinositide-specific phospholipase C.

In contrast, perception of light intensity is much more critical for vertically migrating marine animals due to the dramatic decline of luminance in deep-sea waters54. Opsin sensitivity to light largely depends on the chromophore of 11-cis retinal, isomerization of which typically results in conformational changes and activation of opsin signalling transduction55. Thus, efficient regeneration of 11-cis retinal is necessary to maintain visual function56. In cephalopods, the retinochrome is a major and lineage-specific isomerase in the visual cycle57, consistent with the identification of a retinochrome-encoding gene in the N. pompilius genome (Extended Data Fig. 5). Moreover, in vertebrates, retinal pigment epithelium-specific protein 65 kDa (RPE65) is a key isomerase in driving the visual retinoid cycle through converting all-trans retinyl ester to 11-cis retinol58,59. Intriguingly, an expansion of the RPE65 gene family, which comprises a total of ten genes, was identified in the N. pompilius genome (Supplementary Fig. 13). In silico molecular simulation revealed that nautilus RPE65 shares a conserved iron ion-binding site, an active site cavity and a hydrophobic tunnel for substrate entry with human RPE65, thus suggesting potential catalytic activity (Supplementary Fig. 14 and Extended Data Fig. 6). Unlike restricted expression of RPE65 in pigment epithelium in vertebrates, broad expression of RPE65 across tissues including the eye was observed in N. pompilius in this study (Supplementary Figs. 15 and 16), which may be explained by the fact that the molluscan (including nautilus) retina lacks an anatomical architecture similar to the pigment epithelium.
From a perspective of evolutionary adaptation, the appearance of the pinhole eye is one adaptive breakthrough essential to the nautilus lifestyle of vertical depth migrations, allowing the organism to acquire spatial vision and rapidly cope with hydrostatic pressure within the eye through opening the pupil to seawater. Overall, multiple genomic innovations including gene losses, independent contraction and expansion of specific gene families and presence of associated regulatory networks seem to work in unison to drive the evolution of the pinhole eye in nautilus.

Extended Data Fig. 6. Modeling and docking of RPE65 and all-trans retinyl ester in N. pompilius and H. sapiens.

Structure model of H. sapiens RPE65 (a) and N. pompilius RPE65 (b) with all-trans retinyl ester (atRE), which is located near the active site defined by the iron ion. The iron cofactor is found near the top face of the propeller axis and is conserved in H. sapiens and N. pompilius; it is directly coordinated by four His residues (His180, His241, His313, His527 in H. sapiens; His169, His229, His301, His507 in N. pompilius), with an average bond length of 2.16 Å in H. sapiens and 2.34 Å in N. pompilius. Ferrous iron is required for its catalytic activity, binding to the hydroxyl oxygen to catalyze the isomerization reaction. The docking site details are displayed, revealing a shorter average bond length (2.95 Å) between atRE and the iron cofactor in N. pompilius (panel f) than in H. sapiens (4.4 Å; panel c), suggesting the catalytic potential of N. pompilius RPE65. The hydrophobic tunnel of N. pompilius RPE65 leads from the protein surface to the active site; its mouth is surrounded by three groups of residues (185–190, 222–224 and 249–259; panel g), which are highly conserved with those in H. sapiens RPE65 (196–202, 234–236 and 261–271; panel d). On the other hand, N. pompilius RPE65 also shows a distinguishing character: the iron cofactor, coordinated by four His residues, three second-shell Glu residues and a Val residue, displays a looser structure (panel h) than that in H. sapiens RPE65 (panel e), with no obvious interference with its catalytic activity.

Pearl shell formation

As the only extant cephalopod with an exoskeleton, nautilus possesses an intricate shell of spiralling chambers that not only acts as a protective physical shield against predation or environmental adversities but also plays an indispensable role in buoyancy maintenance. Thus, the unique shell architecture of nautilus results from adaptive evolution for vertical migration. Generally, molluscan shell formation is one of the fundamental biomineralization processes, in which shell matrix proteins (SMPs) guide the growth of calcium carbonate polymorphs (calcite and/or aragonite) and the organization of crystals into intricate shell structures60. Clearly, understanding the ultrastructural architecture and SMP biocomposition of the N. pompilius shell is important for uncovering the ancient mechanisms underlying shell formation and its evolution. Previous studies have assumed that the composition of aragonite crystals underpins superior strength and toughness for resisting high hydrostatic pressures in N. pompilius3,61. Our scanning electron microscopy (SEM) images of the N. pompilius inner layers confirm this and reveal pure aggregates of hexagonal aragonites that stack up along the direction of growth (Fig. 5a). Thus, our results lend support to the hypothesis that aragonite may have been the ancestral crystalline form of calcium carbonate before calcite became a staple building block for the construction of the molluscan shell62. To further investigate the molecular basis of nautilus shell formation, a total of 78 SMPs were identified from acid-soluble (ASM) or acid-insoluble (AIM) matrix fractions derived from 2 technical replicates (Fig. 5b and Supplementary Table 13). Expression patterns showed that most of these SMPs (72.2%) were expressed especially highly in the mantle (Extended Data Fig. 7), thereby confirming a central role of the mantle in shell formation as suggested previously in molluscan species63,64.

Fig. 5. Ultrastructure and proteome of the N. pompilius shell.

a, SEM images representative of the ultrastructure of the nacre layer of the N. pompilius shell. b, Number of proteins identified from the AIM and ASM fractions. c, Circos diagram showing similarities between five representative molluscan shell proteomes (the E-value cut-off of protein–protein BLAST is 1 × 10⁻⁵). Proteins sharing similarities between N. pompilius and other species are linked by different coloured lines, with the top quartile as the purple line, the second quartile as the blue line, the third quartile as the orange line and the lowest quartile as the grey line. The percentages and proportions in brackets represent the number of proteins having similarities between N. pompilius and four reference species. d, UpSet plot comparing orthologous groups and species-specific genes among five species. The red dot indicates conserved domains among the five species. e, Shell protein expression levels in nine tissues. The inset shows the top 10 mantle-enriched SMPs in N. pompilius containing new repetitive poly (Gly or Gly-Ala) motifs in de novo prediction.
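The E-value cut-off mentioned for panel c can be illustrated with a small filter over BLAST tabular output. The sketch assumes the standard 12-column tab-separated layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and keeps the best-scoring hit per query at or below the cut-off. The protein identifiers and hit values are hypothetical, and the quartile colouring of the figure is not reproduced.

```python
def best_hits(blast_lines, max_evalue=1e-5):
    """Best hit per query from 12-column tabular BLAST lines, filtered by E-value.

    Returns {query: (subject, percent_identity, evalue, bitscore)}.
    """
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][3]:
            best[query] = (subject, identity, evalue, bitscore)
    return best


# Hypothetical hits: nautilus SMPs (queries) against another mollusc's shell proteome.
toy_hits = [
    "npo_smp1\tref_a\t62.5\t200\t70\t5\t1\t200\t3\t198\t1e-40\t150.0",
    "npo_smp1\tref_b\t40.0\t180\t100\t8\t5\t184\t10\t180\t1e-12\t70.0",
    "npo_smp2\tref_c\t30.0\t90\t60\t3\t1\t90\t1\t88\t0.5\t25.0",
]
hits = best_hits(toy_hits)
total_queries = 2  # number of query proteins in this toy example
print(hits, f"{100.0 * len(hits) / total_queries:.1f}% of queries have a hit")
```

Dividing the number of queries with a surviving hit by the proteome size gives the kind of shared-protein percentage reported in brackets in the figure.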

Extended Data Fig. 7. Specific expression of SMPs in the mantle of N. pompilius.

The heatmap shows the normalized expression profiles of shell proteins in different tissues, indicating that the majority of SMPs are expressed specifically and in high abundance in the mantle. Nautilus-specific shell protein genes are marked in green in the coloured bar on the right.

To characterize the conserved molluscan biomineralization ‘toolkit’, we performed comparative shell proteomic analysis, which showed that 21 N. pompilius SMPs shared similarity with counterparts in other molluscs including bivalves and gastropods (Fig. 5c). Further domain analysis revealed several conserved domains across molluscs, including the Sushi/SCR/CCP, laminin, chitin-binding and carbonic anhydrase domains (Extended Data Fig. 8). This evidence points to the possibility that these domains occur as an ancient ‘core biomineralization toolkit’ and are conserved across multiple molluscan lineages with an external shell65,66. OrthoFinder analysis showed that 52 of the 78 SMPs were new or N. pompilius-specific shell proteins (Fig. 5d), leading us to speculate that most of the unique SMPs evolved independently and contribute to a high degree of diversity in shell architecture in molluscs. This is also supported by evidence for low similarity of the key SMP, Nautilin-63, even within the same Nautilus genus (Supplementary Fig. 17)67. Strikingly, we found that the top 10 mantle-enriched SMPs in N. pompilius do not match any known Pfam domains but contain new repetitive poly (Gly or Gly-Ala) motifs identified through de novo predictions (Fig. 5e). Therefore, the preponderance of these SMPs may be associated with the uniqueness and new features of the nautilus shell structure, further bolstering our previous assumption. Interestingly, several repetitive low-complexity domains (RLCDs) involved in aggregation or binding have been extensively identified in shell structure proteins in multiple nacre-producing bivalve and gastropod lineages68,69, strongly suggesting that parallel evolution of RLCDs could be a unifying principle for molluscan biomineralization, especially for nacre formation.

Extended Data Fig. 8. Conserved molluscan biomineralization “toolkit” among five molluscan species.

The conserved domains of shell matrix proteins contain Sushi/SCR/CCP domain, laminin domain, chitin binding domain and carbonic anhydrase domain. Domain architecture was predicted and constructed by the software SMART.

Immune system

Understanding the molecular mechanisms of immune defence in N. pompilius helps to delineate the ancient evolutionary features of innate immunity in cephalopod ancestors. Whole-genome annotation reveals that nautilus has complex and comprehensive innate immune components. In particular, Toll-like receptor (TLR) signalling and tumour necrosis factor receptor (TNFR) signalling, the central regulators that mediate key immune responses including apoptosis, inflammation and immune defences70,71, are found in nautilus (Fig. 6a), suggesting an ancient origin and co-option of innate defence ‘toolkit’ genes in cephalopod ancestors. Moreover, several genes, including IL17R, H-lectin and IL1, were specifically identified in the nautilus genome (Fig. 6b), which supports the assumption that nautilus has preserved a more complete repertoire of immune molecules than other cephalopods. Since massive duplication or expansion of key immune genes is a fundamental approach to boosting host defence72, we analysed the gene number of immune defence-related genes and compared distinct lineage-specific gene family expansions in nautilus and coleoids (Fig. 6b). Strikingly, the nautilus genome encodes a total of 81 C-type lectin genes, a significant expansion relative to the 12–33 genes found in coleoids (Fig. 6b). Phylogenetic analysis further revealed that several lineage-specific lectin genes are independently duplicated in N. pompilius (Fig. 6c). In animals, lectins are versatile immune molecules indispensable for discrimination, neutralization, agglutination and destruction of pathogens via specific binding of unique carbohydrate moieties on the surface of bacteria73. Hence, we reason that massive expansion of lectins may have created remarkable inherent diversity that is conducive to containing different pathogens emerging from dynamic environments.
IFN-inducible GTPases (IIGPs), another important class of innate effectors demonstrated to play critical roles in vesicle trafficking and antimicrobial inflammasome assembly74,75, are also specifically expanded in the nautilus genome (Fig. 6b and Supplementary Fig. 18). Thus, an integrated, highly complex and complete innate immune system coupled to lineage-specific gene expansions in nautilus contributes to the establishment of sophisticated host responses against a diverse spectrum of invading pathogens during the organism’s evolutionary history. However, we also observed that interleukin-17 (IL-17) is specifically expanded in the octopod lineage (Fig. 6b and Supplementary Fig. 19), suggesting that distinct defence mechanisms have evolved in different cephalopod lineages.

Fig. 6. Functionally complete and specific gene expansion in the N. pompilius immune system.

a, Schematic representation of molecular components in the TLR and TNFR signalling pathways. AP-1, activator protein 1; IKK, inhibitor of nuclear factor kappa-B kinase; IRAK, interleukin-1 receptor-associated kinase; JNK, c-Jun NH2-terminal kinase; MKK, mitogen-activated protein kinase kinase; Myd88, myeloid differentiation primary response 88; NF-κB, nuclear factor kappa-B; NOS, nitric oxide synthase; TAK1, transforming growth factor-β-activated kinase 1; TRAF6, TNFR-associated factor 6; TRAK, trafficking kinesin-binding protein. b, Distribution of TLR and TNFR signalling pathway components in representative cephalopod species. Gene numbers are represented by spheres of different sizes and colours. c, Phylogeny of C-type lectin in cephalopod species. The different colours in the circle represent distinct species. N. pompilius-specific expanded clades are labelled in light red.

Discussion

Genomic evidence reveals that nautilus has undergone lineage-specific innovations in both body plan and behaviour since the Cambrian and retained these extraordinary features after a long evolutionary history. In particular, vertical depth migration in Nautilus and other chambered cephalopods is one of several critical and common strategies needed to avoid predators and budget energy; these may have helped the survival of these species ever since. The emergence of the pinhole eye is a great innovation for switching from directional to spatial vision and for tolerating rapid changes in hydrostatic pressure, making vertical depth migration possible. Our findings highlight that co-evolutionary loss of core regulatory transcription factors may have driven the evolution of the pinhole eye. Moreover, our proteomic and transcriptomic data suggest that an ancient ‘core biomineralization toolkit’ and new RLCDs coordinately directed the construction of the chambered shell, which has evolved into the buoyancy apparatus needed to adapt to a critical life mode. Taken together, the draft genome of N. pompilius and the accompanying multi-omics data provide valuable insight into not only the adaptive innovations of the ancestor of cephalopods but also the dynamic evolution of coleoids.

Methods

Sample collection and research ethics

A sample of N. pompilius was originally obtained via a biological resources reconnaissance survey in October 2016, during which a single adolescent individual of N. pompilius with a body size of 12 cm was collected near the Nansha Islands of the South China Sea (7° 62′ 7514′′ N, 112° 26′ 4571′′ E). The adolescent nautilus was then maintained in a dark tank at 16–19 °C while being transported. The organism was subsequently donated by the Chinese Ocean Conservation Association for research use in this study in accordance with local research guidelines and regulations on animal experimentation. All experimental protocols were reviewed and approved by the research ethics committee for animal experiments at the South China Sea Institute of Oceanology, Chinese Academy of Sciences. Nautilus muscle was used to extract DNA with a DNeasy Blood & Tissue Kit (QIAGEN). Multiple tissue samples including the mantle, eye, tentacle, funnel, gill, beak, muscle and liver were used for RNA extraction with the TRIzol reagent (Thermo Fisher Scientific). The quantity and quality of DNA were checked using a Qubit 2.0 fluorometer (Thermo Fisher Scientific) and agarose gel electrophoresis, respectively.

Illumina sequencing and genome size estimation

The 270-base pair (bp) paired-end libraries were constructed using Illumina’s paired-end kits according to the manufacturer’s instructions. The libraries were sequenced on an Illumina HiSeq 2500 platform. For the raw reads, sequencing adaptors were removed. Contaminated reads containing chloroplast, mitochondrial, bacterial or viral sequences were screened via alignment to the National Center for Biotechnology Information (NCBI) NR database using the Burrows–Wheeler Aligner (BWA) v.0.7.13 (ref. 76) with default parameters. FastUniq v.1.1 (ref. 77) was used to remove duplicated read pairs. Low-quality reads were filtered out on the basis of the following conditions: (1) reads with ≥10% unidentified nucleotides; (2) reads with >10 nucleotides aligned to an adaptor, allowing ≤10% mismatches; and (3) reads with >50% bases having Phred quality <5. About 59.78 gigabases (81.83×) of corrected Illumina reads were selected for genome size estimation. The N. pompilius genome size was estimated using the formula: genome size = k-mer_number/peak_depth.
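As a minimal sketch of the k-mer-based estimate described above, the following Python snippet applies genome size = k-mer_number/peak_depth to a k-mer depth histogram. The function name, the toy histogram and the min_depth cut-off for skipping low-depth (probably erroneous) k-mers are illustrative assumptions and are not taken from this study.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size as total k-mer count divided by the main peak depth.

    histogram maps k-mer depth -> number of distinct k-mers seen at that depth.
    min_depth is a hypothetical cut-off that excludes low-depth k-mers, which
    mostly arise from sequencing errors.
    """
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Depth at which the largest number of distinct k-mers is observed.
    peak_depth = max(usable, key=usable.get)
    # Total number of k-mer occurrences above the error cut-off.
    kmer_number = sum(depth * n for depth, n in usable.items())
    return kmer_number / peak_depth, peak_depth


# Toy histogram with made-up numbers (not data from this study).
toy_histogram = {1: 900, 2: 300, 19: 40, 20: 100, 21: 40}
size, peak = estimate_genome_size(toy_histogram)
print(f"peak depth = {peak}, estimated genome size = {size:.0f}")
```

As a rough cross-check of the figures quoted above, 59.78 gigabases of reads at 81.83× depth correspond to a genome of roughly 730 megabases.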

PacBio sequencing

Genomic DNA was sheared by means of a g-TUBE device (Covaris) with 20-kilobase (kb) settings. Sheared DNA was purified and concentrated with AMPure XP Beads (Agencourt) for further use in single-molecule real-time (SMRT) bell preparation according to the manufacturer’s protocol (Pacific Biosciences). The 20-kb template preparation was done by BluePippin size selection (Sage Science). Size-selected and isolated SMRT bell fractions were purified with AMPure XP Beads. Finally, these purified SMRT bells were used for primer and polymerase (P6) binding according to the manufacturer’s binding calculator (Pacific Biosciences). Single-molecule sequencing was done on a PacBio RS II platform with C4 chemistry. Only PacBio subreads equal to or longer than 500 bp were used to perform N. pompilius genome assembly.

Genome assembly

Canu, LoRDEC and wtdbg

We used the error correction module of Canu v.1.5 (ref. 78) to select for longer subreads with the settings genomeSize = 753,000,000 and corOutCoverage = 109, detect overlaps among raw subreads with the highly sensitive overlapper MHAP v.2.12 (corMhapSensitivity = normal) and complete error correction by the falcon_sense method (correctedErrorRate = 0.025). Then, the output subreads of Canu were further corrected with Illumina paired-end reads using LoRDEC v.0.6 (ref. 79) with the parameters -k 19 -s 3. Based on these two rounds of error-corrected subreads, we generated a draft assembly with wtdbg v.1.1.006 (https://github.com/ruanjue/wtdbg) with the parameters -t 64 -H -k 21 -S 1.02 -e 3.

Sparse, DBG2OLC and Canu

Trimmed Illumina 270-bp paired-end reads were assembled as contigs using the Sparse software (https://github.com/yechengxi/SparseAssembler)80 with default parameters. The DBG2OLC (https://github.com/yechengxi/DBG2OLC) software with the parameters KmerCovTh 2 MinOverlap 55 AdaptiveTh 0.008 k 17 RemoveChimera 1 was used to assemble the genome and combine the paired-end read assembled contigs. PacBio subreads were corrected using Canu v.1.5 as described above. The split_and_run_sparc.sh shell script, created with the Sparc module and blasr software v.1.3.1 (ref. 81), was used to output the consensus assembly.

Quickmerge

The output assembly of Sparse, DBG2OLC and Canu, as a query input, was aligned against the assembly of Canu, LoRDEC and wtdbg with MUMmer v.4.0.0 (https://github.com/mummer4/mummer) with the nucmer parameters -b 500 -c 100 -l 200 -t 12 and the delta-filter parameters -I 90 -r -q and then merged using quickmerge82 with the parameters -hco 5.0 -c 1.5 -l 100000 -ml 5000. Finally, iterative polishing by Pilon v.1.22 (ref. 83) was achieved by aligning adaptor-trimmed paired-end Illumina reads to the draft assembly with the parameters --mindepth 10 --changes --threads 4 --fix bases.

Evaluation of genome assembly

To evaluate genome quality, we first mapped Illumina reads onto the N. pompilius assembly with the BWA. Next, genome completeness was verified by mapping 248 highly conserved eukaryotic genes and 908 metazoan benchmarking universal single-copy orthologues to the genome by using BUSCO v.3.0.2b (ref. 84).

Genome annotation

TE analysis was performed by building a repeat library with the prediction programs LTR_FINDER v.1.05 (ref. 85), MITE-Hunter v.1.0.0 (ref. 86), RepeatScout v.1.0.6 (ref. 87) and PILER-DF v.1.0 (ref. 88). The database was classified using PASTEClassifier v.1.0 (ref. 18) and combined with the Repbase database v.19.06 (ref. 89). TE sequences in the N. pompilius genome were identified and classified using RepeatMasker v.2.3 (ref. 90). TE divergence analysis was made by using a detailed annotation table from the output of RepeatMasker v.2.3 (ref. 90). By using the percentage of discrepancy between matching regions and consensus sequences in the database, we analysed the number of TEs with a certain divergence rate and built a repeat landscape using an R script that was modified from https://github.com/ValentinaBoP/TransposableElements.

Protein-coding genes were predicted based on EVM v.1.1.1 (ref. 91) by integrating homologue, RNA sequencing (RNA-seq) and de novo gene prediction methods. Homologue prediction was performed based on homologous peptides from Crassostrea gigas, Crassostrea virginica, L. gigantea and Danio rerio with GeMoMa v.1.3.1 (ref. 92). RNA-seq-based gene prediction was performed by mapping clean RNA-seq reads to the genome using Hisat v.2.0.4 and assembling the mapped reads with StringTie v.1.2.3. Multiple methods including PASA v.2.0.2, TransDecoder v.2.0 and GeneMarkS-T v.5.1 were applied to predict coding regions. GENSCAN v.20030218 (ref. 93), AUGUSTUS v.2.4 (ref. 94), GlimmerHMM v.3.0.4 (ref. 91), GeneID v.1.4 (ref. 95) and SNAP v.2006–07–28 (ref. 96) were used for de novo gene prediction with default parameters. UniGenes were assembled by Trinity v.Trinityrnaseq_r20131110 (ref. 97) and were then inputted to PASA v.2.0.2 (ref. 98) to predict genes. Training models used in AUGUSTUS, GlimmerHMM and SNAP were obtained from the prediction results of PASA v.2.0.2 and GeMoMa v.1.3.1. Gene models from these different approaches were combined by EVM v.1.1.1.

The predicted genes were annotated by BLAST searches of their sequences against a number of nucleotide and protein sequence databases, including COG Release 201703 (ref. 99), KEGG Release 20170310 (ref. 100), NCBI NR Release 2016_7_19 and SWISS-PROT Release 2015_01 (ref. 101) with an E-value cut-off of 1 × 10⁻⁵. Moreover, these predicted genes were annotated against the Pfam database using HMMER v.3.1b2 (http://www.hmmer.org) and against the InterPro database using InterProScan v.5.34-73.0 (https://github.com/ebi-pf-team/interproscan). Gene Ontology for each gene was assigned by Blast2GO v.2.5 (ref. 102) based on NCBI databases.

Phylogenetic analysis, gene expansion and contraction

Protein sequences of Branchiostoma floridae (GCF_000003815.1), L. gigantea (GCF_000327385.1), A. californica (GCF_000002075.1), Tribolium castaneum (GCF_000002335.3), C. gigas (GCF_000297895.1), Helobdella robusta (GCF_000326865.1), Capitella teleta (GCA_000328365.1), Chlamys farreri (CfBase), Nematostella vectensis (GCF_000209225.1), E. scolopes (GCA_004765925.1), O. bimaculoides (GCF_001194135.1), Octopus minor (GigaDB), O. vulgaris (CephRes-gdatabase), Drosophila melanogaster (FlyBase), Homo sapiens (hg38) and N. pompilius comprising 388,531 protein sequences were clustered into 40,231 orthologue groups using OrthoMCL v.3.1 (ref. 103) based on an all-versus-all BLASTP strategy with an E-value of 1 × 10⁻⁵ and a Markov chain clustering default inflation parameter of 1.5. To construct phylogenetic relationships, 423 single-copy orthologues were extracted from all 16 species and multiple alignment analysis was performed with MUSCLE v.3.8.31 (ref. 104). All alignments were combined into one supergene and a phylogenetic tree was analysed with RAxML v.8.2.12 (ref. 105) with 1,000 rapid bootstrap analyses, followed by searching for a best-scoring maximum likelihood tree in 1 single run. Finally, divergence time was estimated using MCMCTree from the PAML package v.4.7a (ref. 106) in combination with a molecular clock model. Several calibration time points were taken from the TimeTree database (http://timetree.org/) (Supplementary Table 14). Homologue clusters with >100 gene copies in 1 or more species were excluded from the OrthoMCL results. Expansion and contraction of the remaining homologue clusters were determined by CAFE v.4.2 (ref. 107) calculations with the parameters lambda -s and P < 0.01 on the basis of changes in gene family size with regard to phylogeny and species divergence time.
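The selection of single-copy orthologues and their concatenation into one supergene can be illustrated with a short Python sketch. It assumes that orthologue groups and per-group alignments are already available as dictionaries; the function names, group identifiers, species labels and sequences are hypothetical and do not reproduce the OrthoMCL or MUSCLE output formats.

```python
def single_copy_groups(groups, species):
    """Return IDs of groups that contain exactly one gene in every species.

    groups maps group ID -> {species name: [gene IDs]}.
    """
    return [
        group_id
        for group_id, members in groups.items()
        if all(len(members.get(sp, [])) == 1 for sp in species)
    ]


def concatenate_alignments(alignments, species):
    """Join per-group alignments into one supergene sequence per species.

    alignments maps group ID -> {species name: aligned sequence}.
    """
    parts = {sp: [] for sp in species}
    for group_id in sorted(alignments):
        aligned = alignments[group_id]
        # Every species must have the same aligned length within a group.
        if len({len(aligned[sp]) for sp in species}) != 1:
            raise ValueError(f"unequal alignment lengths in {group_id}")
        for sp in species:
            parts[sp].append(aligned[sp])
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy input (hypothetical identifiers and sequences).
species = ["nautilus", "octopus", "limpet"]
groups = {
    "OG1": {"nautilus": ["n1"], "octopus": ["o1"], "limpet": ["l1"]},
    "OG2": {"nautilus": ["n2", "n3"], "octopus": ["o2"], "limpet": ["l2"]},
    "OG3": {"nautilus": ["n4"], "octopus": ["o3"]},
    "OG4": {"nautilus": ["n5"], "octopus": ["o4"], "limpet": ["l3"]},
}
alignments = {
    "OG1": {"nautilus": "MK-A", "octopus": "MKTA", "limpet": "MRTA"},
    "OG4": {"nautilus": "GG", "octopus": "GA", "limpet": "G-"},
}
print(single_copy_groups(groups, species))
print(concatenate_alignments(alignments, species))
```

Groups with duplicated or missing genes (OG2 and OG3 in the toy input) are dropped, so that each retained group contributes exactly one sequence per species to the supergene.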

Evolutionary rate test

To compare the relative evolutionary rates of N. pompilius with other cephalopods, 1,223 one-to-one orthologues were identified among 5 cephalopod species and L. gigantea with the InParanoid v.4.1 software (http://inparanoid.sbc.su.se). Then, these 1,223 orthologous proteins were aligned with MUSCLE v.3.8.31 and concatenated into a super alignment, with L. gigantea assigned as the out-group. Tajima’s relative rate test analysis was conducted using MEGA v.7.0.18 (ref. 108).
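For readers unfamiliar with Tajima’s relative rate test, the following Python sketch shows the underlying calculation for two in-group sequences and an out-group. It is an illustration of the standard test statistic, not a reproduction of the MEGA implementation; the function name, the gap characters and the toy sequences are assumptions.

```python
import math


def tajima_relative_rate(seq_a, seq_b, outgroup, gap_chars="-X?"):
    """Tajima's relative rate test on three aligned sequences.

    m1 counts sites where only seq_a differs (seq_b matches the out-group);
    m2 counts sites where only seq_b differs (seq_a matches the out-group).
    Under equal rates, (m1 - m2)^2 / (m1 + m2) follows a chi-square
    distribution with 1 degree of freedom.
    """
    if not (len(seq_a) == len(seq_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, o in zip(seq_a, seq_b, outgroup):
        if a in gap_chars or b in gap_chars or o in gap_chars:
            continue  # skip gapped or ambiguous columns
        if a != b and b == o:
            m1 += 1
        elif a != b and a == o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p_value


# Toy alignment (made-up sequences, not data from this study).
m1, m2, chi2, p = tajima_relative_rate("AAAAAAAAAC", "CCCCCAAAAA", "AAAAAAAAAA")
print(m1, m2, round(chi2, 3), round(p, 3))
```

A lineage with significantly fewer unique differences than its counterpart (a small m1 relative to m2, or the reverse) is interpreted as evolving more slowly with respect to the out-group.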

To compare the neutral nucleotide mutation rate for N. pompilius relative to other cephalopods, the 4D sites of the 1,223 one-to-one orthologues from 5 cephalopods and 1 out-group (L. gigantea) were aligned. The topology obtained from our phylogenetic analysis was then supplied to RAxML v.8.2.12 (ref. 105) to optimize branch lengths on the 4D alignment. Pairwise distances to L. gigantea were calculated from the neutral tree by using the cophenetic function implemented in the R package ape v.3.2.

Exon and intron evolution in cephalopod species

The 1,223 orthologous proteins of 5 cephalopod species were aligned using MUSCLE v.3.8.31. The positions of introns longer than 50 nucleotides and with characteristic U2 or U12 splicing boundaries were mapped out using a customized Perl script. In addition, 3,071 discordant intron positions were identified based on previous methods109, the distributions of which were determined based on their phylogenetic relationship. Intron gains and losses were inferred by phylogenetic distributions using parsimony.

Population size estimation

The demographic history of N. pompilius was analysed with the PSMC v.0.6.5 software110. The synonymous mutation rate per base per year was inferred based on the formula T = ks/(2λ). The generation time was assumed to be 15 years in N. pompilius and 3 months to 1 year in other cephalopods (Supplementary Table 15).
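The formula T = ks/(2λ) can be rearranged to λ = ks/(2T) to obtain a per-year rate, which is then scaled by the generation time when a per-generation rate is needed. The Python sketch below shows this arithmetic; the function names and the example ks and divergence-time values are hypothetical, and only the 15-year generation time is taken from the text.

```python
def substitution_rate_per_year(ks, divergence_time_years):
    """Rearrange T = ks / (2 * rate) to rate = ks / (2 * T)."""
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return ks / (2.0 * divergence_time_years)


def rate_per_generation(rate_per_year, generation_time_years):
    """Scale a per-year rate to a per-generation rate."""
    return rate_per_year * generation_time_years


# Hypothetical inputs: ks and divergence time are placeholders only.
example_ks = 0.4
example_divergence_time = 200e6  # years
per_year = substitution_rate_per_year(example_ks, example_divergence_time)
per_generation = rate_per_generation(per_year, 15)  # 15-year generation time
print(f"{per_year:.2e} per site per year, {per_generation:.2e} per generation")
```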

Hox gene analysis

The structure of Hox genes in the N. pompilius genome was analysed with GeMoMa v.1.4.2 (ref. 111) using default parameters and based on available Hox gene models. Predictions were made by applying a GeMoMa annotation filter with default parameters, with the exception of the evidence percentage filter (e = 0.1). These were then manually verified to achieve a single high-confidence transcript prediction per locus. The exact annotations of each Hox gene were completed using phylogenetic relationships.

Analysis of eye development genes

Key transcription factors and genes for eye development in the human genome were used as queries to identify their orthologues in other lineages. For lineage-specific gene families, such as S-crystallin, queries were set as homologues in the genome of O. bimaculoides. First, homologous searches in the gene set were performed using BLASTP with an E-value of 1 × 10⁻⁵. Then, the identified candidates were aligned back to the human gene set; only candidates whose best BLASTP hit matched the original query were defined as orthologues in each species. Additionally, TBLASTN was used to avoid any omissions in genome annotation. The accession numbers of these protein sequences are listed in Supplementary Table 12.
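The forward search followed by alignment back to the human gene set amounts to a reciprocal best-hit filter, which can be sketched in Python as follows. The sketch assumes that BLASTP results have already been reduced to (query, subject, E-value, bit score) tuples; the function names and the toy hits are hypothetical and do not come from this study.

```python
def best_hits(rows, max_evalue=1e-5):
    """Return the highest-scoring subject for each query.

    rows is an iterable of (query, subject, evalue, bitscore) tuples.
    Hits with an E-value above max_evalue are ignored.
    """
    best = {}
    for query, subject, evalue, bitscore in rows:
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_rows, reverse_rows, max_evalue=1e-5):
    """Keep query/subject pairs that are each other's best hit."""
    forward = best_hits(forward_rows, max_evalue)
    reverse = best_hits(reverse_rows, max_evalue)
    return {q: s for q, s in forward.items() if reverse.get(s) == q}


# Toy hits (made-up identifiers and scores).
human_vs_nautilus = [
    ("PAX6", "npo_001", 1e-80, 310.0),
    ("PAX6", "npo_002", 1e-20, 95.0),
    ("SIX3", "npo_003", 1e-50, 200.0),
    ("OTX2", "npo_004", 1e-3, 40.0),   # fails the E-value cut-off
]
nautilus_vs_human = [
    ("npo_001", "PAX6", 1e-78, 305.0),
    ("npo_003", "SIX6", 1e-55, 215.0),  # best reverse hit is a paralogue
    ("npo_003", "SIX3", 1e-48, 190.0),
]
print(reciprocal_best_hits(human_vs_nautilus, nautilus_vs_human))
```

In the toy input only the PAX6 pair is retained, because the best reverse hit of the SIX3 candidate is a different human gene.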

Transcriptomic analysis

Total RNA was isolated from different tissues of N. pompilius and treated with RNase-free DNase I (Promega Corporation), according to the manufacturer’s protocol. The quality and integrity of RNA were checked using an Agilent 2100 Bioanalyzer. Illumina RNA-seq libraries were prepared and sequenced on a HiSeq 2500 system with a PE150 strategy, according to the manufacturer’s instructions (Illumina). After trimming based on quality scores using Btrim v.0.2.0, clean reads were aligned to the N. pompilius genome with TopHat v.2.1.1 (ref. 112). Gene abundance in different tissues was calculated using Cufflinks v.2.1.1 (ref. 113).

SEM

To characterize crystal structures, precleaned N. pompilius shells were fractured and carefully collected with a dissecting knife. Pieces of fractured ligaments were dried with liquid nitrogen at a critical point followed by platinum coating using a sputter coater. Then, the shell surface was examined by SEM (S-3400N; Hitachi) with an accelerating voltage of 30 kV in high vacuum mode.

Isolation of shell proteomics

SMPs were extracted from N. pompilius shells according to a protocol described previously with minor modifications114. First, shells were processed using abrasive paper to remove organic contaminants on the surface and washed with Milli-Q water three times. Then, shells were immersed in 5% NaClO for 24 h at 4 °C with gentle shaking, washed three times with Milli-Q water and air-dried at room temperature. Shells were ground into a powder and sieved by means of a nylon mesh (200 μm). Afterwards, the shell powder was bleached using 10% NaClO for 5 h. The mixture was then centrifuged at 3,000 r.p.m. for 10 min at 4 °C to remove the supernatant, washed twice and freeze-dried. The precleaned shell powder was titrated using 10% acetic acid at 4 °C with gentle shaking until all calcified constituents were completely dissolved. The powder solution was centrifuged again at 1,000 r.p.m. for 10 min at 4 °C to yield supernatant (ASM) and precipitate (AIM) fractions. The AIM fraction was further washed twice in Milli-Q water, lyophilized and reconstituted with 8 M of urea (with 2% SDS). Both AIM and ASM were concentrated using an Amicon Ultra 3 K centrifugal filter, purified with methanol/chloroform and further reconstituted in 8 M of urea.

Since the concentrations of AIM and ASM proteins were quite low, we adopted an in-solution digestion method. Briefly, proteins were reduced by dithiothreitol with a final concentration of 10 mM at 56 °C for 1 h. The exposed sulphhydryl groups were then alkylated by 55 mM of iodoacetamide for 30 min at room temperature. After being diluted eightfold with 50 mM of triethylammonium bicarbonate, the sample solutions were digested for 16 h at 37 °C using sequencing-grade trypsin (Promega Corporation), desalted via Sep-Pak C18 cartridges (Waters Corporation) and dried off in a vacuum concentrator. The dried samples were then reconstituted in 0.1% formic acid for analysis by a LTQ Orbitrap Elite system coupled to an EASY-nLC (Thermo Fisher Scientific), as described elsewhere115. The .mgf files converted from raw liquid chromatography–tandem mass spectrometry data files using Proteome Discoverer 1.3.0.339 (Thermo Fisher Scientific) were searched with Mascot v.2.3.2 (Matrix Science). The database included both target and decoy sequences of the N. pompilius protein database. Proteins detected in two replicates were kept for further analysis.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Supplementary information

Supplementary Information: Supplementary Figs. 1–19.

Reporting Summary

Supplementary Tables: Supplementary Tables 1–15.

Acknowledgements

We thank our lab members and collaborators who have provided us with able assistance or valuable advice at all stages of this study. We acknowledge grant support from the National Key R&D Program of China (no. 2018YFC1406505 to Yang Zhang), Key Special Project for Introduced Talents Team of Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (no. GML2019ZD0407 to Yang Zhang), the Strategic Priority Research Program of the Chinese Academy of Sciences (no. XDA13020202 to Z.Y.), Science and Technology Program of Guangzhou (no. 201804020073 to Z.Y.), Institution of South China Sea Ecology and Environmental Engineering, Chinese Academy of Sciences (no. ISEE2018PY03 to Yang Zhang), Science and Technology Planning Project of Guangdong Province (no. 2017B030314052 to Z.Y.), National Science Foundation of China (no. 32073002 to Yang Zhang, no. 31902404 to F.M. and no. 31671490 to X.S.), Demonstration Project for Innovative Development of Marine Economy (no. NBHY-2017-S4 to Y.B.) and the Austrian Science Fund (no. P30686-B29 to O.S.).

Author contributions

Yang Zhang and Z.Y. conceived the study, designed the scientific objectives and coordinated the project. Yang Zhang and F.M. led the manuscript preparation and writing. M.H. handled the materials for the genomic, transcriptomic and proteomic sequencing. Haitao Ma and Yuehuan Zhang performed the DNA extraction. L.W. and H.D. prepared the large-insert genomic libraries. F.M., L.W. and M.M. carried out the genome sequencing, assembly and annotation. N.-K.W. and H.Z. performed the gene family analysis. S.X., Z.X. and J.L. performed the mRNA extraction and library preparation for transcriptomic sequencing. L.W. and F.W. analysed the Hox gene cluster. Y.X. and Z.Z. performed the TE divergence analysis. M.H. and X.M. analysed the mRNA expression patterns of key transcriptional factors. Yang Zhang and F.M. participated in analysing the visual model. Huawei Mu, F.M. and X.S. extracted and identified the SMPs. F.M. and L.Z. performed the shell ultrastructure analysis. F.M. and Yang Zhang conducted the immune system analysis. Y.B. performed the evolutionary and PSMC analyses. Yang Zhang, O.S., N.-K.W. and Z.Y. participated in the final data analysis and interpretation. All authors read, wrote and approved the manuscript.

Data availability

The nautilus genome project has been deposited with the NCBI under the BioProject number PRJNA614552. The whole-genome sequencing data were deposited with the sequence read archive (SRA) database under accession nos. SRR11485669–SRR11485706. The RNA-seq data from various tissue transcriptomes have also been deposited with the SRA database under accession nos. SRR11485678–SRR11485687. Gene annotation data have been deposited in the Genome Warehouse database of the Genome Sequence Archive (GSA) under accession no. GWHBECW00000000.

Competing interests

The authors declare no competing interests.

Footnotes

Peer review information Nature Ecology & Evolution thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Yang Zhang, Fan Mao, Huawei Mu, Minwei Huang, Yongbo Bao, Lili Wang.

Change history

9/28/2021

A Correction to this paper has been published: 10.1038/s41559-021-01571-4

Extended data

Extended data is available for this paper at 10.1038/s41559-021-01448-6.

Supplementary information

The online version contains supplementary material available at 10.1038/s41559-021-01448-6.

References

  • 1. Kröger B, Vinther J, Fuchs D. Cephalopod origin and evolution: a congruent picture emerging from fossils, development and molecules. Bioessays. 2011;33:602–613. doi: 10.1002/bies.201100001.
  • 2. Teichert C, Matsumoto T. In: Saunders WB, Landman NH, editors. Nautilus: the Biology and Paleobiology of a Living Fossil. Springer; 2010. pp. 25–32.
  • 3. Lüttge U, Souza GM. The Golden Section and beauty in nature: the perfection of symmetry and the charm of asymmetry. Prog. Biophys. Mol. Biol. 2019;146:98–103. doi: 10.1016/j.pbiomolbio.2018.12.008.
  • 4. Crook RJ, Hanlon RT, Basil JA. Memory of visual and topographical features suggests spatial learning in nautilus (Nautilus pompilius L.). J. Comp. Psychol. 2009;123:264–274. doi: 10.1037/a0015921.
  • 5. Crook R, Basil J. A biphasic memory curve in the chambered nautilus, Nautilus pompilius L. (Cephalopoda: Nautiloidea). J. Exp. Biol. 2008;211:1992–1998. doi: 10.1242/jeb.018531.
  • 6. Reiter S, et al. Elucidating the control and development of skin patterning in cuttlefish. Nature. 2018;562:361–366. doi: 10.1038/s41586-018-0591-3.
  • 7. Schnell AK, Clayton NS. Cephalopod cognition. Curr. Biol. 2019;29:R726–R732. doi: 10.1016/j.cub.2019.06.049.
  • 8. Grasso FW, Basil JA. The evolution of flexible behavioral repertoires in cephalopod molluscs. Brain Behav. Evol. 2009;74:231–245. doi: 10.1159/000258669.
  • 9. Albertin CB, et al. The octopus genome and the evolution of cephalopod neural and morphological novelties. Nature. 2015;524:220–224. doi: 10.1038/nature14668.
  • 10. Kim B-M, et al. The genome of common long-arm octopus Octopus minor. Gigascience. 2018;7:giy119. doi: 10.1093/gigascience/giy119.
  • 11. Belcaid M, et al. Symbiotic organs shaped by distinct modes of genome evolution in cephalopods. Proc. Natl Acad. Sci. USA. 2019;116:3030–3035. doi: 10.1073/pnas.1817322116.
  • 12. Zarrella I, et al. The survey and reference assisted assembly of the Octopus vulgaris genome. Sci. Data. 2019;6:13. doi: 10.1038/s41597-019-0017-6.
  • 13. Garrett S, Rosenthal JJC. RNA editing underlies temperature adaptation in K+ channels from polar octopuses. Science. 2012;335:848–851. doi: 10.1126/science.1212795.
  • 14. Liscovitch-Brauer N, et al. Trade-off between transcriptome plasticity and genome evolution in cephalopods. Cell. 2017;169:191–202.e11. doi: 10.1016/j.cell.2017.03.025.
  • 15. Vandepas LE, Dooley FD, Barord GJ, Swalla BJ, Ward PD. A revisited phylogeography of Nautilus pompilius. Ecol. Evol. 2016;6:4924–4935. doi: 10.1002/ece3.2248.
  • 16. Williams RC, et al. The genetic structure of Nautilus pompilius populations surrounding Australia and the Philippines. Mol. Ecol. 2015;24:3316–3328. doi: 10.1111/mec.13255.
  • 17. Fedoroff NV. Transposable elements, epigenetics, and genome evolution. Science. 2012;338:758–767. doi: 10.1126/science.338.6108.758.
  • 18. Wicker T, et al. A unified classification system for eukaryotic transposable elements. Nat. Rev. Genet. 2007;8:973–982. doi: 10.1038/nrg2165.
  • 19. Ritschard EA, et al. Coupled genomic evolutionary histories as signatures of organismal innovations in cephalopods: co-evolutionary signatures across levels of genome organization may shed light on functional linkage and origin of cephalopod novelties. Bioessays. 2019;41:e1900073. doi: 10.1002/bies.201900073.
  • 20. Cam HP, Noma K, Ebina H, Levin HL, Grewal SI. Host genome surveillance for retrotransposons by transposon-derived proteins. Nature. 2008;451:431–436. doi: 10.1038/nature06499.
  • 21. Fachinetti D, et al. DNA sequence-specific binding of CENP-B enhances the fidelity of human centromere function. Dev. Cell. 2015;33:314–327. doi: 10.1016/j.devcel.2015.03.020.
  • 22. Kocot KM, et al. Phylogenomics reveals deep molluscan relationships. Nature. 2011;477:452–456. doi: 10.1038/nature10382.
  • 23. Bergmann S, Lieb B, Ruth P, Markl J. The hemocyanin from a living fossil, the cephalopod Nautilus pompilius: protein structure, gene organization, and evolution. J. Mol. Evol. 2006;62:362–374. doi: 10.1007/s00239-005-0160-x.
  • 24. Mutvei H, Zhang Y-B, Dunca E. Late Cambrian plectronocerid nautiloids and their role in cephalopod evolution. Palaeontology. 2007;50:1327–1333.
  • 25. Tanner AR, et al. Molecular clocks indicate turnover and diversification of modern coleoid cephalopods during the Mesozoic Marine Revolution. Proc. Biol. Sci. 2017;284:20162818. doi: 10.1098/rspb.2016.2818.
  • 26. Wood R, et al. Integrated records of environmental change and evolution challenge the Cambrian Explosion. Nat. Ecol. Evol. 2019;3:528–538. doi: 10.1038/s41559-019-0821-6.
  • 27. Elderfield H, et al. Evolution of ocean temperature and ice volume through the mid-Pleistocene climate transition. Science. 2012;337:704–709. doi: 10.1126/science.1221294.
  • 28. Jansen JHF, Kuijpers A, Troelstra SR. A mid-Brunhes climatic event: long-term changes in global atmosphere and ocean circulation. Science. 1986;232:619–622. doi: 10.1126/science.232.4750.619.
  • 29. Wang PX, et al. Long-term cycles in the carbon reservoir of the Quaternary ocean: a perspective from the South China Sea. Natl Sci. Rev. 2014;1:119–143.
  • 30. Hao Q, et al. Delayed build-up of Arctic ice sheets during 400,000-year minima in insolation variability. Nature. 2012;490:393–396. doi: 10.1038/nature11493.
  • 31.Kim B-M, et al. Antarctic blackfin icefish genome reveals adaptations to extreme environments. Nat. Ecol. Evol. 2019;3:469–478. doi: 10.1038/s41559-019-0812-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Bowen BW, Muss A, Rocha LA, Grant WS. Shallow mtDNA coalescence in Atlantic pygmy angelfishes (genus Centropyge) indicates a recent invasion from the Indian Ocean. J. Hered. 2006;97:1–12. doi: 10.1093/jhered/esj006. [DOI] [PubMed] [Google Scholar]
  • 33. Pearson JC, Lemons D, McGinnis W. Modulating Hox gene functions during animal body patterning. Nat. Rev. Genet. 2005;6:893–904. doi: 10.1038/nrg1726.
  • 34. Garcia-Fernàndez J. The genesis and evolution of homeobox gene clusters. Nat. Rev. Genet. 2005;6:881–892. doi: 10.1038/nrg1723.
  • 35. Wang S, et al. Scallop genome provides insights into evolution of bilaterian karyotype and development. Nat. Ecol. Evol. 2017;1:120. doi: 10.1038/s41559-017-0120.
  • 36. Simakov O, et al. Insights into bilaterian evolution from three spiralian genomes. Nature. 2013;493:526–531. doi: 10.1038/nature11696.
  • 37. Shigeno S, et al. Evolution of the cephalopod head complex by assembly of multiple molluscan body parts: evidence from Nautilus embryonic development. J. Morphol. 2008;269:1–17. doi: 10.1002/jmor.10564.
  • 38. Huan P, Wang Q, Tan S, Liu B. Dorsoventral decoupling of Hox gene expression underpins the diversification of molluscs. Proc. Natl Acad. Sci. USA. 2020;117:503–512. doi: 10.1073/pnas.1907328117.
  • 39. Nilsson D-E, Arendt D. Eye evolution: the blurry beginning. Curr. Biol. 2008;18:R1096–R1098. doi: 10.1016/j.cub.2008.10.025.
  • 40. Arendt D. The evolution of cell types in animals: emerging principles from molecular studies. Nat. Rev. Genet. 2008;9:868–882. doi: 10.1038/nrg2416.
  • 41. Lin Q, et al. The seahorse genome and the evolution of its specialized morphology. Nature. 2016;540:395–399. doi: 10.1038/nature20595.
  • 42. Arendt D, et al. The origin and evolution of cell types. Nat. Rev. Genet. 2016;17:744–757. doi: 10.1038/nrg.2016.127.
  • 43. Lee MSY, et al. Modern optics in exceptionally preserved eyes of Early Cambrian arthropods from Australia. Nature. 2011;474:631–634. doi: 10.1038/nature10097.
  • 44. Ogino H, Yasuda K. Induction of lens differentiation by activation of a bZIP transcription factor, L-Maf. Science. 1998;280:115–118. doi: 10.1126/science.280.5360.115.
  • 45. Reza HM, Yasuda K. Roles of Maf family proteins in lens development. Dev. Dyn. 2004;229:440–448. doi: 10.1002/dvdy.10467.
  • 46. Sharon-Friling R, et al. Lens-specific gene recruitment of zeta-crystallin through Pax6, Nrl-Maf, and brain suppressor sites. Mol. Cell. Biol. 1998;18:2067–2076. doi: 10.1128/mcb.18.4.2067.
  • 47. Yoshida MA, et al. Molecular evidence for convergence and parallelism in evolution of complex brains of cephalopod molluscs: insights from visual systems. Integr. Comp. Biol. 2015;55:1070–1083. doi: 10.1093/icb/icv049.
  • 48. Ogura A, et al. Loss of the six3/6 controlling pathways might have resulted in pinhole-eye evolution in Nautilus. Sci. Rep. 2013;3:1432. doi: 10.1038/srep01432.
  • 49. Ward P, Carlson B, Weekly M, Brumbaugh B. Remote telemetry of daily vertical and horizontal movement of Nautilus in Palau. Nature. 1984;309:248–250.
  • 50. O’Dor RK, Forsythe J, Webber DM, Wells J, Wells MJ. Activity levels of Nautilus in the wild. Nature. 1993;362:626–628.
  • 51. Porter ML, et al. Shedding new light on opsin evolution. Proc. Biol. Sci. 2012;279:3–14. doi: 10.1098/rspb.2011.1819.
  • 52. Ramirez MD, et al. The last common ancestor of most bilaterian animals possessed at least nine opsins. Genome Biol. Evol. 2016;8:3640–3652. doi: 10.1093/gbe/evw248.
  • 53. Marshall NJ, Messenger JB. Colour-blind camouflage. Nature. 1996;382:408–409.
  • 54. Nilsson D-E. The evolution of eyes and visually guided behaviour. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2009;364:2833–2847. doi: 10.1098/rstb.2009.0083.
  • 55. Arshavsky VY, Lamb TD, Pugh EN. G proteins and phototransduction. Annu. Rev. Physiol. 2002;64:153–187. doi: 10.1146/annurev.physiol.64.082701.102229.
  • 56. Jacobson SG, et al. Identifying photoreceptors in blind eyes caused by RPE65 mutations: prerequisite for human gene therapy success. Proc. Natl Acad. Sci. USA. 2005;102:6177–6182. doi: 10.1073/pnas.0500646102.
  • 57. Hara T, et al. Rhodopsin and retinochrome in the retina of a tetrabranchiate cephalopod, Nautilus pompilius. Zoolog. Sci. 1995;12:195–201.
  • 58. Jin MH, Li S, Moghrabi WN, Sun H, Travis GH. Rpe65 is the retinoid isomerase in bovine retinal pigment epithelium. Cell. 2005;122:449–459. doi: 10.1016/j.cell.2005.06.042.
  • 59. Moiseyev G, Chen Y, Takahashi Y, Wu BX, Ma J-X. RPE65 is the isomerohydrolase in the retinoid visual cycle. Proc. Natl Acad. Sci. USA. 2005;102:12413–12418. doi: 10.1073/pnas.0503460102.
  • 60. Marin F, Luquet G, Marie B, Medakovic D. Molluscan shell proteins: primary structure, origin, and evolution. Curr. Top. Dev. Biol. 2008;80:209–276. doi: 10.1016/S0070-2153(07)80006-8.
  • 61. Schoeppler V, et al. Crystal growth kinetics as an architectural constraint on the evolution of molluscan shells. Proc. Natl Acad. Sci. USA. 2019;116:20388–20397. doi: 10.1073/pnas.1907229116.
  • 62. Vendrasco MJ, Checa AG, Kouchinsky AV. Shell microstructure of the early bivalve Pojetaia and the independent origin of nacre within the mollusca. Palaeontology. 2011;54:825–850.
  • 63. Zhang G, et al. The oyster genome reveals stress adaptation and complexity of shell formation. Nature. 2012;490:49–54. doi: 10.1038/nature11413.
  • 64. Zhao R, et al. Dual gene repertoires for larval and adult shells reveal molecules essential for molluscan shell formation. Mol. Biol. Evol. 2018;35:2751–2761. doi: 10.1093/molbev/msy172.
  • 65. Aguilera F, McDougall C, Degnan BM. Co-option and de novo gene evolution underlie molluscan shell diversity. Mol. Biol. Evol. 2017;34:779–792. doi: 10.1093/molbev/msw294.
  • 66. Hilgers L, Hartmann S, Hofreiter M, von Rintelen T. Novel genes, ancient genes, and gene co-option contributed to the genetic basis of the radula, a molluscan innovation. Mol. Biol. Evol. 2018;35:1638–1652. doi: 10.1093/molbev/msy052.
  • 67. Marie B, et al. Nautilin-63, a novel acidic glycoprotein from the shell nacre of Nautilus macromphalus. FEBS J. 2011;278:2117–2130. doi: 10.1111/j.1742-4658.2011.08129.x.
  • 68. Jackson DJ, et al. Parallel evolution of nacre building gene sets in molluscs. Mol. Biol. Evol. 2010;27:591–608. doi: 10.1093/molbev/msp278.
  • 69. Sudo S, et al. Structures of mollusc shell framework proteins. Nature. 1997;387:563–564. doi: 10.1038/42391.
  • 70. O’Neill LAJ, Bowie AG. The family of five: TIR-domain-containing adaptors in Toll-like receptor signalling. Nat. Rev. Immunol. 2007;7:353–364. doi: 10.1038/nri2079.
  • 71. Chen G, Goeddel DV. TNF-R1 signaling: a beautiful pathway. Science. 2002;296:1634–1635. doi: 10.1126/science.1071924.
  • 72. Zhang L, et al. Massive expansion and functional divergence of innate immune genes in a protostome. Sci. Rep. 2015;5:8693. doi: 10.1038/srep08693.
  • 73. Sharon N, Lis H. Lectins: cell-agglutinating and sugar-specific proteins. Science. 1972;177:949–959. doi: 10.1126/science.177.4053.949.
  • 74. MacMicking JD. IFN-inducible GTPases and immunity to intracellular pathogens. Trends Immunol. 2004;25:601–609. doi: 10.1016/j.it.2004.08.010.
  • 75. Kim B-H, Shenoy AR, Kumar P, Bradfield CJ, MacMicking JD. IFN-inducible GTPases in host cell defense. Cell Host Microbe. 2012;12:432–444. doi: 10.1016/j.chom.2012.09.007.
  • 76. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
  • 77. Xu H, et al. FastUniq: a fast de novo duplicates removal tool for paired short reads. PLoS ONE. 2012;7:e52249. doi: 10.1371/journal.pone.0052249.
  • 78. Koren S, et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27:722–736. doi: 10.1101/gr.215087.116.
  • 79. Salmela L, Rivals E. LoRDEC: accurate and efficient long read error correction. Bioinformatics. 2014;30:3506–3514. doi: 10.1093/bioinformatics/btu538.
  • 80. Khan Z, Bloom JS, Kruglyak L, Singh M. A practical algorithm for finding maximal exact matches in large sequence datasets using sparse suffix arrays. Bioinformatics. 2009;25:1609–1616. doi: 10.1093/bioinformatics/btp275.
  • 81. Ye C, Ma ZS. Sparc: a sparsity-based consensus algorithm for long erroneous sequencing reads. PeerJ. 2016;4:e2016. doi: 10.7717/peerj.2016.
  • 82. Chakraborty M, Baldwin-Brown JG, Long AD, Emerson JJ. Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage. Nucleic Acids Res. 2016;44:e147. doi: 10.1093/nar/gkw654.
  • 83. Gnerre S, et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc. Natl Acad. Sci. USA. 2011;108:1513–1518. doi: 10.1073/pnas.1017351108.
  • 84. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–3212. doi: 10.1093/bioinformatics/btv351.
  • 85. Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35:W265–W268. doi: 10.1093/nar/gkm286.
  • 86. Han Y, Wessler SR. MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res. 2010;38:e199. doi: 10.1093/nar/gkq862.
  • 87. Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21:i351–i358. doi: 10.1093/bioinformatics/bti1018.
  • 88. Edgar RC, Myers EW. PILER: identification and classification of genomic repeats. Bioinformatics. 2005;21:i152–i158. doi: 10.1093/bioinformatics/bti1003.
  • 89. Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob. DNA. 2015;6:11. doi: 10.1186/s13100-015-0041-9.
  • 90. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinform. 2009;Chapter 4:Unit 4.10. doi: 10.1002/0471250953.bi0410s25.
  • 91. Majoros WH, Pertea M, Salzberg SL. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 2004;20:2878–2879. doi: 10.1093/bioinformatics/bth315.
  • 92. Keilwagen J, et al. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res. 2016;44:e89. doi: 10.1093/nar/gkw092.
  • 93. Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 1997;268:78–94. doi: 10.1006/jmbi.1997.0951.
  • 94. Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19:ii215–ii225. doi: 10.1093/bioinformatics/btg1080.
  • 95. Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr. Protoc. Bioinform. 2007;Chapter 4:Unit 4.3. doi: 10.1002/0471250953.bi0403s18.
  • 96. Korf I. Gene finding in novel genomes. BMC Bioinform. 2004;5:59. doi: 10.1186/1471-2105-5-59.
  • 97. Grabherr MG, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 2011;29:644–652. doi: 10.1038/nbt.1883.
  • 98. Haas BJ, et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003;31:5654–5666. doi: 10.1093/nar/gkg770.
  • 99. Tatusov RL, et al. The COG database: an updated version includes eukaryotes. BMC Bioinform. 2003;4:41. doi: 10.1186/1471-2105-4-41.
  • 100. Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27–30. doi: 10.1093/nar/28.1.27.
  • 101. Boeckmann B, et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 2003;31:365–370. doi: 10.1093/nar/gkg095.
  • 102. Conesa A, et al. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–3676. doi: 10.1093/bioinformatics/bti610.
  • 103. Li L, Stoeckert CJ, Jr., Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–2189. doi: 10.1101/gr.1224503.
  • 104. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797. doi: 10.1093/nar/gkh340.
  • 105. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
  • 106. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
  • 107. De Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: a computational tool for the study of gene family evolution. Bioinformatics. 2006;22:1269–1271. doi: 10.1093/bioinformatics/btl097.
  • 108. Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 2016;33:1870–1874. doi: 10.1093/molbev/msw054.
  • 109. Venkatesh B, et al. Elephant shark genome provides unique insights into gnathostome evolution. Nature. 2014;505:174–179. doi: 10.1038/nature12826.
  • 110. MacLeod IM, Larkin DM, Lewin HA, Hayes BJ, Goddard ME. Inferring demography from runs of homozygosity in whole-genome sequence, with correction for sequence errors. Mol. Biol. Evol. 2013;30:2209–2223. doi: 10.1093/molbev/mst125.
  • 111. Keilwagen J, Hartung F, Grau J. GeMoMa: homology-based gene prediction utilizing intron position conservation and RNA-seq data. Methods Mol. Biol. 2019;1962:161–177. doi: 10.1007/978-1-4939-9173-0_9.
  • 112. Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25:1105–1111. doi: 10.1093/bioinformatics/btp120.
  • 113. Trapnell C, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 2012;7:562–578. doi: 10.1038/nprot.2012.016.
  • 114. Marie B, et al. The shell-forming proteome of Lottia gigantea reveals both deep conservations and lineage-specific novelties. FEBS J. 2013;280:214–232. doi: 10.1111/febs.12062.
  • 115. Mu HW, Sun J, Heras H, Chu KH, Qiu J-W. An integrated proteomic and transcriptomic analysis of perivitelline fluid proteins in a freshwater gastropod laying aerial eggs. J. Proteomics. 2017;155:22–30. doi: 10.1016/j.jprot.2017.01.006.
  • 116. Yau K-W, Hardie RC. Phototransduction motifs and variations. Cell. 2009;139:246–264. doi: 10.1016/j.cell.2009.09.029.
  • 117. Nikolaeva O, Takahashi Y, Moiseyev G, Ma J-X. Purified RPE65 shows isomerohydrolase activity after reassociation with a phospholipid membrane. FEBS J. 2009;276:3020–3030. doi: 10.1111/j.1742-4658.2009.07021.x.

Associated Data

Supplementary Materials

Supplementary Information (PDF, 3.6 MB): Supplementary Figs. 1–19.

Reporting Summary (PDF, 69.6 KB)

Supplementary Tables (XLSX, 51.8 KB): Supplementary Tables 1–15.

Data Availability Statement

The nautilus genome project has been deposited with the NCBI under BioProject number PRJNA614552. The whole-genome sequencing data have been deposited in the Sequence Read Archive (SRA) database under accession nos. SRR11485669–SRR11485706. The RNA-seq data from various tissue transcriptomes have also been deposited in the SRA database under accession nos. SRR11485678–SRR11485687. Gene annotation data have been deposited in the Genome Warehouse database of the Genome Sequence Archive (GSA) under accession no. GWHBECW00000000.


Articles from Nature Ecology & Evolution are provided here courtesy of Nature Publishing Group