2022 Sep 4;23(1):294–311. doi: 10.1111/1755-0998.13702

Genomic investigation of the Chinese alligator reveals wild‐extinct genetic diversity and genomic consequences of their continuous decline

Shangchen Yang 1, Tianming Lan 2,3, Yi Zhang 1, Qing Wang 2,4, Haimeng Li 2,4, Nicolas Dussex 5,6,7, Sunil Kumar Sahu 2, Minhui Shi 2,4, Mengyuan Hu 1, Yixin Zhu 2,4, Jun Cao 8,9, Lirong Liu 8,9, Jianqing Lin 1, Qiu‐Hong Wan 1, Huan Liu 2,3, Sheng‐Guo Fang 1
PMCID: PMC10087395  PMID: 35980602

Abstract

Critically endangered species are usually restricted to small and isolated populations. High inbreeding without gene flow among populations further aggravates their threatened condition and reduces the likelihood of their long‐term survival. The Chinese alligator (Alligator sinensis) is one of the most endangered crocodiles in the world and has experienced a continuous decline over the past c. 1 million years. In order to identify the genetic status of the remaining populations and aid conservation efforts, we assembled the first high‐quality chromosome‐level genome of the Chinese alligator and explored the genomic characteristics of three extant breeding populations. Our analyses revealed the existence of at least three genetically distinct populations, comprising two breeding populations in China (Changxing and Xuancheng) and one breeding population in an American wildlife refuge. The American population does not belong to the last two populations of its native range (Xuancheng and Changxing), thus representing genetic diversity extinct in the wild and providing future opportunities for genetic rescue. Moreover, the effective population size of these three populations has been continuously declining over the past 20 ka. Consistent with this decline, the species shows extremely low genetic diversity, a large proportion of long runs of homozygous fragments, and mutational load across the genome. Finally, to provide genomic insights for future breeding management and conservation, we assessed the feasibility of mixing extant populations based on the likelihood of introducing new deleterious alleles and signatures of local adaptation. Overall, this study provides a valuable genomic resource and important genomic insights into the ecology, evolution, and conservation of critically endangered alligators.

Keywords: Alligator sinensis, conservation, critically endangered species, genetic diversity, inbreeding, mutational load, ROH

1. INTRODUCTION

A large proportion of Earth's biodiversity is severely impacted by human activities, and endangered species are forced to survive in small and isolated populations (Haddad et al., 2015). Indian tiger and Scandinavian wolf (Kardos et al., 2018) are representative examples threatened with local extinction. These populations usually suffer from reduced genetic diversity and severe inbreeding, leading to a reduced potential to adapt to environmental changes (Kardos et al., 2018; Khan et al., 2021). Moreover, inbreeding leads to the exposure of deleterious alleles in the homozygous state, thus reducing the survival of individuals via inbreeding depression. Assisted gene flow via translocations from other populations is a promising way to maintain and/or recover small populations and thus induce a genetic rescue effect (Foote et al., 2019). For example, translocations of the puma subspecies (Puma concolor) successfully improved the survival and fitness of the receiving population (Saremi et al., 2019).

It is worth noting that assisted gene flow poses a potential risk of outbreeding depression when mixing long‐term isolated populations. Because isolated populations may have been exposed to different evolutionary pressures and have fixed different adaptive alleles, the combination of distinct haplotypes may interfere with the interaction between genes (Frankham, 2005; Keller & Waller, 2002). Moreover, there is a likelihood of introducing new deleterious alleles that would increase the mutational load of the receiving population (Kyriazis et al., 2020; Robinson et al., 2019). Scientific management and conservation require a full understanding of the genetic status of all populations of a target species (Hedrick & Garcia‐Dorado, 2016).

The Chinese alligator is one of the species on the brink of extinction and is currently listed as “critically endangered” on the IUCN Red List (Jiang & Wu, 2018). It belongs to an ancient reptilian lineage that has survived since the Mesozoic, and once inhabited a large area of wetlands, marshes and ponds of the lower Yangtze River (Thorbjarnarson et al., 2002). However, this species was restricted to the border area of Anhui, Jiangsu and Zhejiang Provinces by the 1900s. At present, no more than 100 mature Chinese alligators remain in the wild, with a fragmented distribution in Xuancheng, Jingxian, Guangde, Nanling, and Longxi, five narrow regions in Anhui Province (Jiang & Wu, 2018). According to census data, their age structure is unbalanced and egg‐laying performance has declined in the wild populations (Ding et al., 2001).

Conservation and breeding projects were carried out at the Anhui Research Centre of Chinese Alligator Reproduction (hereafter Anhui Centre, acronym XC) and Changxing Yinjiabian Chinese Alligator Nature Reserve (hereafter Changxing Centre, acronym CX) in 1979, based on 212 and 11 wild founders, respectively (Wu et al., 2002). In addition, breeding programmes were also implemented at the St. Augustine Alligator Farm, Bronx Zoo, and the Rockefeller Refuge in America (acronym ACA) (Ross et al., 1998). Although the overall population has been increasing, their genetic diversity was reported to be very low (Wan et al., 2013; Wu et al., 2002; Zhai et al., 2017) and inbreeding within each population seemed unavoidable (Wu et al., 2006). Indications of inbreeding depression have emerged in both the Anhui and Changxing Centres, characterized by reduced reproductive ability and physical deformities in offspring (Wu et al., 1999, 2006). Translocation programs have been carried out to help increase the fitness of the small CX population. To weigh the pros and cons of such translocations, it is essential to examine the genomic background of extant populations, which would aid in designing management measures for the conservation of Chinese alligators.

Here, we assembled and annotated the first chromosome‐scale genome for Chinese alligator, and sequenced 23 samples from three isolated populations. To clarify the genetic relationship and survival potential of different populations, we investigated population structure, demographic history, genetic diversity, inbreeding status, mutational load and local adaptation. This study will provide a valuable genomic resource and important genomic insights into the ecology, evolution, and conservation of this critically endangered species.

2. MATERIALS AND METHODS

2.1. Samples collection and ethics statement

The blood sample was collected from an adult Chinese alligator reared at Changxing Centre, China. The blood sample was divided into four tubes with 2.5 ml for each tube for PacBio long‐read sequencing, Illumina short‐read sequencing, RNA‐seq sequencing, and Hi‐C sequencing. We also collected blood or umbilical cord samples (2 ml for each individual) from 23 wild or semi‐wild individuals for resequencing, including eight individuals from CX (including a wild founder “CX1”), nine individuals from XC and six individuals from ACA (Figure 2a, Table S1). Research and sample collection were both approved by the Animal Ethics Committee of Zhejiang University (ZJU20210267) and the Institutional Review Board of BGI (BGI‐IRB E22002).

FIGURE 2.

Distribution and genetic population structure of Chinese alligator populations. (a) Distribution of Chinese alligators in China and translocated individuals in America and sampling locations in this study. n represents the number of samples. Pictures above and below show original and external distribution, respectively. (b) Principal component analysis of 23 individuals showing the first and second principal components. Pairwise F ST is added near the bidirectional arrow. Asterisk represents the founder of CX population. (c) Inferred population genetic structure of the 23 individuals using the maximum likelihood method with a model with two to four ancestral components. (d) Unrooted tree constructed using the neighbor‐joining method from biallelic SNPs among 23 Chinese alligators and A. mississippiensis (acronym AM). The p‐distance is indicated by the scale bar. (e) F3 statistics for all three populations to detect the potentially admixed relationships.

2.2. DNA and RNA extraction, library construction and genome sequencing

High molecular weight genomic DNA was extracted from blood samples using the DNeasy Blood and Tissue kit (Qiagen) for PacBio sequencing, with an average insert size of 20 kb. SMRTbell libraries were constructed using SMRTbell Template Pre‐Kits (Pacific Biosciences), according to the manufacturer's instructions. Genomic DNA used for resequencing and genome analyses was extracted using a phenol‐chloroform protocol together with ethanol precipitation (Sambrook et al., 1989). DNA libraries with short insert sizes were constructed according to the manufacturer's instructions for the Illumina sequencing platform. For Hi‐C sequencing, we first cross‐linked the blood sample with formaldehyde, and then the Hi‐C library was constructed according to the procedures of Lieberman‐Aiden et al. (2009). Total RNA was isolated using TRIzol reagent (Invitrogen), and the Agilent 2100 Bioanalyser system (Agilent) and Qubit 3.0 (Life Technologies) were used for RNA quantity, integrity, and purity evaluation. DNA libraries for RNA‐seq, whole‐genome resequencing and genome analyses were all sequenced on the Illumina HiSeq X 10 system (Illumina) (Tables S1 and S2).

2.3. De novo assembly and assessment

Genome size and heterozygosity of the Chinese alligator were estimated by the k‐mer frequency method (Lander & Waterman, 1988). We first assembled an initial genome with error‐corrected PacBio long reads based on the Overlap‐Layout‐Consensus algorithm (Li et al., 2011). Daligner in Falcon (version 0.5) software (Chin et al., 2016) was used to map all PacBio reads to the longest single‐pass reads and then LASort, LAMerge and pbdagcon were used to generate consensus of mapped reads. Contigs were polished using the Quiver (version 2.3.1) consensus‐calling algorithm (Chin et al., 2013) with PacBio long reads. Contigs were further corrected by the Pilon (version 1.18) (Walker et al., 2014) software with 276.83 Gb Illumina short reads. Hi‐C reads were first filtered using the program “filter_data_parallel” in the SOAPdenovo2 package (r240) (Luo et al., 2012) before genome mapping. We then mapped Hi‐C clean reads to the draft assembly using Burrows‐Wheeler aligner mem (BWA, version 0.7.17) (Li & Durbin, 2010) with default parameters. Finally, the 3d‐DNA pipeline (version 180922) (Durand et al., 2016) was used to concatenate the contigs into the chromosome‐level genome. All Illumina short reads were remapped to the final assembly to correct misassembled bases. The completeness of the genome was evaluated by BUSCO analysis using the vertebrata_odb10 database. We also mapped the Illumina short reads, RNA‐seq data and Hi‐C data to our assembled genome by BWA software with default parameters.

2.4. Genome annotation

De novo prediction and homology‐based method were both used for repetitive elements annotation in our Chinese alligator genome. De novo prediction was performed using RepeatModeler2 (version 1.0.9) (Flynn et al., 2020) with default parameters. De novo predicted repetitive elements were then added into the RepBase as known repeats. RepeatMasker (version 4.1.1) (Tarailo‐Graovac & Chen, 2009) was finally carried out by searching in RepBase library (Jurka et al., 2005) for identifying and classifying transposable elements. Tandem Repeats Finder (TRF version 4.09) (Benson, 1999) was also used to identify tandem repeats.

Gene annotation was performed based on the repeat‐masked genome. Gene prediction was carried out following the procedure described previously (Wang et al., 2017). Briefly, de novo gene prediction, the RNA‐seq method and homologous protein alignment were used to annotate protein‐coding genes for our genome with Maker (version 2.31.11) (Campbell et al., 2014). We used SNAP (version 1.0) (Korf, 2004), Genscan (version 1.0) (Burge & Karlin, 1997), glimmerHMM (version 3.0.3) (Majoros et al., 2004) and AUGUSTUS (version 2.5.5) (Keller et al., 2011) for de novo gene prediction to identify protein‐coding genes in our assembled genome. We collected protein sequences from Anolis carolinensis, Alligator mississippiensis, A. sinensis (GenBank ID: GCF_000455745.1), Crocodylus porosus, Gavialis gangeticus, Gallus gallus, Homo sapiens, Meleagris gallopavo, Taeniopygia guttata and Xenopus tropicalis in the NCBI database for homology‐based predictions. GeneWise (version 2.2.0) (Birney et al., 2004) was used for gene model prediction. Transcripts, after filtering with Trimmomatic (version 0.27) (Bolger et al., 2014) and assembly with Trinity (version 2.9.0) (Haas et al., 2013), were aligned to our genome by the program to assemble spliced alignments (PASA) (version 2.2.0) (Haas et al., 2008) to predict gene structures. The final consensus gene set was obtained by combining the above‐mentioned three gene sets by Maker (version 2.31.11) (Campbell et al., 2014). All protein‐coding genes were aligned to the InterPro (Apweiler et al., 2001), KEGG (Kanehisa & Goto, 2000), Swiss‐Prot and GO databases for functional annotation.

2.5. Syntenic analysis

To examine the synteny between the GCF_000455745.1 genome and our assembly, we firstly performed the whole‐genome alignment using LAST (version 973) (Kielbasa et al., 2011) software with the following parameters: lastdb ‐uNEAR ‐cR11; lastal ‐P16 ‐m100 ‐E0.05; last‐split ‐m1. Synteny blocks from the GCF_000455745.1, which were aligned to the same chromosome in our assembly, were firstly sorted together and then visualized using Circos (version 0.69–9) (Krzywinski et al., 2009) software.

2.6. Divergence time estimation

We first performed protein alignment of 15 species (A. sinensis, A. carolinensis, A. mississippiensis, C. porosus, G. gangeticus, G. gallus, H. sapiens, M. gallopavo, T. guttata, X. tropicalis, Ophiophagus hannah (GCA_000516915.1), Gekko japonicus (GCA_001447785.1), Pelodiscus sinensis (GCA_000230535.1), Chrysemys picta (GCA_000241765.2) and Chelonia mydas (GCA_000344595.1)) by blastp in BLASTtools (version 2.2.26) (Altschul et al., 1990) with default parameters and clustered by Orthomcl (version 1.4) (Li et al., 2003) with the inflation parameter of 1.5. A total of 2595 single‐copy genes shared by all species were identified and then used to build a phylogenetic tree by IQTREE (version 1.6.12) (Lam‐Tung et al., 2015) with the maximum‐likelihood method. We used the MCMCTREE (version 4.5) in the PAML software (Yang, 2007) to estimate the divergence time among species with the parameter “burnin = 1000, sample‐number = 1,000,000, sample‐frequency = 2”. Multiple fossil time points were used for time calibrations from Timetree (http://www.timetree.org/) (Table S3).

2.7. Read mapping and variant calling

Raw sequencing reads from the 23 Chinese alligator individuals were filtered with Trimmomatic (version 0.27) (Bolger et al., 2014). Low‐quality reads, reads with more than 10% “Ns” and reads with adaptor sequences were filtered out. Clean reads were mapped to our improved assembly with BWA mem with default parameters and one bam file was generated for each individual. SAMtools (version 1.3) was then used for sorting, indexing, and removing duplicates from bam files. For variant calling, we first used SAMtools to generate a raw variant call format (vcf) file with the strict parameters “mpileup ‐q 1 ‐C 50 ‐t SP ‐t DP ‐m 2 ‐F 0.002” (Li et al., 2009); this vcf file then served as the set of reference variants for base quality score recalibration (BQSR). Next, the genome variant call format (gvcf) file for each individual was generated by using the Genome Analysis Toolkit (GATK version 4.0.3.0) (Depristo et al., 2011) with the function of HaplotypeCaller. Joint calling was then performed to generate the combined VCF file. Hard filtering was applied to get the single‐nucleotide polymorphism (SNP) sites set with “QUAL < 30.0 || QD < 2.0 || MQ < 40.0 || FS > 60.0 || SOR > 3.0 || HaplotypeScore > 13.0 || MQRankSum < −12.5 || ReadPosRankSum < −8.0”. We also filtered out biallelic SNPs with the highest and lowest 0.25% depth. In the population structure analysis, we only retained loci with a missing ratio of less than 10% and a minor allele frequency (MAF) larger than 0.05, and finally reduced the bias caused by linkage disequilibrium (LD) by pruning with Plink (version 1.9) (Chang et al., 2015) with the parameter “‐‐indep‐pairwise 10 kb 1 0.5”.
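The hard-filter expression quoted above is a logical OR over per-site annotations: a site is removed as soon as any single threshold is violated. The following is a minimal Python sketch of that logic only, not the GATK implementation. The function name `fails_hard_filter` and the dictionary representation of a site are hypothetical, and the sketch assumes that an annotation missing from a site is simply not tested.

```python
# Thresholds copied from the hard-filter expression in the text.
# Each rule returns True when the site should be filtered out.
RULES = [
    ("QUAL", lambda v: v < 30.0),
    ("QD", lambda v: v < 2.0),
    ("MQ", lambda v: v < 40.0),
    ("FS", lambda v: v > 60.0),
    ("SOR", lambda v: v > 3.0),
    ("HaplotypeScore", lambda v: v > 13.0),
    ("MQRankSum", lambda v: v < -12.5),
    ("ReadPosRankSum", lambda v: v < -8.0),
]


def fails_hard_filter(annotations):
    """Return True if any threshold is violated (logical OR of all rules).

    `annotations` is a hypothetical dict such as {"QUAL": 50.0, "QD": 12.1}.
    Assumption: annotations absent from the dict are skipped, not failed.
    """
    return any(
        key in annotations and violated(annotations[key])
        for key, violated in RULES
    )


if __name__ == "__main__":
    good = {"QUAL": 80.0, "QD": 15.0, "MQ": 60.0, "FS": 1.2, "SOR": 0.7}
    bad = {"QUAL": 80.0, "QD": 1.5, "MQ": 60.0}
    print(fails_hard_filter(good), fails_hard_filter(bad))
```

Because the rules are combined with OR, adding more annotations can only make the filter stricter, never more permissive.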

2.8. Population structure analysis

Pairwise Weir and Cockerham's F ST (Weir & Cockerham, 1984) for the three populations were calculated using VCFtools (version 0.1.16) (Danecek et al., 2011). VCFtools was used to convert VCF files to plink format files to conduct principal component analysis (PCA) with Plink (version 1.9). We used the program ADMIXTURE (version 1.3.0) (Alexander et al., 2009) to infer genetic clusters representing distinct ancestral components. K values of 1–5 were run with the “–cv” flag to compute the cross‐validation error and to infer the most likely value of K. To analyse the phylogenetic relationship, we first aligned the A. mississippiensis genome against our assembly using the LAST (version 973) software with the following parameters: “lastdb ‐uNEAR ‐cR11; lastal ‐P16 ‐m100 ‐E0.05; last‐split ‐m1” to identify conserved regions between the two alligator genomes. Then the alleles of A. mississippiensis were added to the vcf file of Chinese alligators, and we finally kept 312,525 SNPs in these conserved regions. A phylogenetic tree was then constructed with A. mississippiensis serving as the outgroup using IQTREE (version 1.6.12) with 1000 bootstraps. The tree layout was generated using the online tool iTOL (http://itol.embl.de). We then performed F3 statistics using qp3Pop implemented in ADMIXTOOLS (version 5.1) (Patterson et al., 2012) to test if one population was an admixed population of the other two. LD was calculated on SNP pairs within a 1000‐kb window using PopLDdecay (version 3.40) (Zhang, Dong, et al., 2018).

2.9. Demographic inference and population divergence

We combined multiple sequentially Markovian coalescent (MSMC) and approximate Bayesian computation (ABC) methods to track fluctuations in effective population size (N e) from 10 ka BP to the present day for the three populations, considering the different limitations and resolutions of each approach. Firstly, SNPs were phased by BEAGLE (version 5.0) (Browning et al., 2018) with default parameters. We randomly selected four individuals from each population and masked uncovered regions with bamCaller.py for them. MSMC (Schiffels & Durbin, 2014) was then run with the following parameters: ‐R ‐i 20 ‐t 6 ‐p ‘10*1 + 15*2’. The final result was visualized with a generation time of 20 years and a mutation rate of 7.9 × 10⁻⁹ substitutions per site per generation (Green et al., 2014). To further infer the most recent population history of the Chinese alligator, we only used SNPs with a MAF > 0.2 in the VCF file to run PopSizeABC (version 2.1) (Boitard et al., 2016) with the following parameters: mac (minor allele count threshold for AFS and IBS statistics computation) = 0; mac_ld (minor allele count threshold for LD statistics computation) = 2, 3 and 4, respectively; L (size of each segment, in bp) = 4,000,000; nb_rep (number of simulated data sets) = 500; nb_seg (number of independent segments in each data set) = 30.
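MSMC reports time and coalescence rate in mutation-scaled units, so the generation time and mutation rate quoted above are what turn its output into years and N e. A minimal sketch of that conversion follows, assuming the usual MSMC scaling (time in generations = scaled time / μ; N e = 1 / (2 μ λ)). The function name `scale_msmc` and the example input values are illustrative and not taken from the study's output.

```python
MU = 7.9e-9   # substitutions per site per generation (value quoted in the text)
GEN = 20      # generation time in years (value quoted in the text)


def scale_msmc(scaled_time, scaled_rate, mu=MU, gen=GEN):
    """Convert one MSMC time segment to (years before present, Ne).

    scaled_time: mutation-scaled time boundary as reported by MSMC.
    scaled_rate: mutation-scaled coalescence rate (lambda) of that segment.
    Assumes the conventional scaling: generations = scaled_time / mu and
    Ne = 1 / (2 * mu * lambda).
    """
    years = scaled_time / mu * gen
    ne = 1.0 / (2.0 * mu * scaled_rate)
    return years, ne


if __name__ == "__main__":
    # Made-up segment, purely to show the arithmetic.
    years, ne = scale_msmc(7.9e-6, 1000.0)
    print(f"{years:.0f} years BP, Ne = {ne:.0f}")
```

Both outputs scale inversely with μ, so any uncertainty in the mutation rate shifts the whole curve along both axes rather than changing its shape.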

Population divergence was inferred using four randomly selected samples from each population by MSMC2 (version 2.1.2) (Schiffels & Durbin, 2014) with the following parameters: ‐‐skipAmbiguous ‐I 0‐8,0‐9,0‐10,0‐11,0‐12,0‐13,0‐14,0‐15,1‐8,1‐9,1‐10,1‐11,1‐12,1‐13,1‐14,1‐15,2‐8,2‐9,2‐10,2‐11,2‐12,2‐13,2‐14,2‐15,3‐8,3‐9,3‐10,3‐11,3‐12,3‐13,3‐14,3‐15,4‐8,4‐9,4‐10,4‐11,4‐12,4‐13,4‐14,4‐15,5‐8,5‐9,5‐10,5‐11,5‐12,5‐13,5‐14,5‐15,6‐8,6‐9,6‐10,6‐11,6‐12,6‐13,6‐14,6‐15,7‐8,7‐9,7‐10,7‐11,7‐12,7‐13,7‐14,7‐15 ‐i 20 ‐t 6 ‐p ‘28*1 + 1*2’. The split between each pair of populations was estimated as the time point at which the relative cross‐coalescent rate (RCCR) dropped to 0.5.
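The long ‐I argument above enumerates every cross-population haplotype pair: with four diploid individuals per population, haplotypes 0 to 7 belong to the first population and 8 to 15 to the second. The sketch below regenerates that list and locates the RCCR = 0.5 crossing by linear interpolation. It assumes the standard definition RCCR = 2λ12 / (λ11 + λ22), which the text does not spell out, and the function names and toy values are hypothetical.

```python
def cross_pairs(n_hap_a, n_hap_b):
    """Comma-separated haplotype index pairs between two populations."""
    return ",".join(
        f"{i}-{j}"
        for i in range(n_hap_a)
        for j in range(n_hap_a, n_hap_a + n_hap_b)
    )


def rccr(lambda_11, lambda_12, lambda_22):
    """Relative cross-coalescence rate (assumed standard definition)."""
    return 2.0 * lambda_12 / (lambda_11 + lambda_22)


def split_time(times, rccr_values, threshold=0.5):
    """Time at which RCCR crosses the threshold.

    `times` must be ordered from recent to ancient. The crossing is found in
    the first interval where RCCR rises from below the threshold to at or
    above it, using linear interpolation. Returns None if there is none.
    """
    points = list(zip(times, rccr_values))
    for (t0, r0), (t1, r1) in zip(points, points[1:]):
        if r0 < threshold <= r1:
            frac = (threshold - r0) / (r1 - r0)
            return t0 + frac * (t1 - t0)
    return None


if __name__ == "__main__":
    print(cross_pairs(8, 8)[:23] + ",...")
    # Toy RCCR curve, not data from the study.
    print(split_time([100, 1000, 2000], [0.1, 0.4, 0.6]))
```

Interpolating on a linear time axis is a simplification; because MSMC2 time segments are roughly log-spaced, interpolating on log time would be an equally defensible choice.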

2.10. Gene flow

We performed a D‐statistics test (A, B; X, Y) by using qpDstat in ADMIXTOOLS, where we set A. mississippiensis as Y, XC as X, CX and ACA as A and B, respectively. TreeMix (version 1.13) (Pickrell & Pritchard, 2012) was then used to infer models of population split and migration between different populations by setting A. mississippiensis as an outgroup with the parameter “‐m 1‐5 ‐k 1000 ‐root AM”. Identity by descent (IBD) analysis was performed in refined‐ibd software (16May19.ad5.jar) (Browning & Browning, 2013) with default parameters. Gene flow within and among populations was then estimated by the pairwise IBD fragments from different generations (g). We inferred generations with the equation l = 100/(2 g), where l was the length of IBD in cM (Thompson, 2013). The estimated recombination rate (1 Mb = 0.89 cM) was calculated by dividing the overall recombination rate map length by whole genome size (Stapley et al., 2017) from a genetic linkage map of a female C. porosus based on microsatellite markers (Miles et al., 2009). We assumed that the recombination rate was conserved between Chinese alligator and C. porosus, considering that crocodilian families had similar karyotype, genome size and evolutionary rate.
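The relationship l = 100/(2 g) can be inverted to date a shared segment: g = 100/(2 l), with l in cM. Combined with the 0.89 cM per Mb conversion quoted above, a segment's physical length gives an approximate age in generations. The following is a minimal sketch of that arithmetic; the function name `ibd_generations` is hypothetical and the example lengths are illustrative.

```python
CM_PER_MB = 0.89  # recombination rate quoted in the text (1 Mb = 0.89 cM)


def ibd_generations(length_mb, cm_per_mb=CM_PER_MB):
    """Approximate age in generations of an IBD segment.

    Inverts l = 100 / (2 g), where l is the segment length in cM.
    """
    if length_mb <= 0:
        raise ValueError("segment length must be positive")
    length_cm = length_mb * cm_per_mb
    return 100.0 / (2.0 * length_cm)


if __name__ == "__main__":
    for mb in (1.0, 10.0, 50.0):
        print(f"{mb:5.1f} Mb -> about {ibd_generations(mb):.1f} generations")
```

Longer segments map to fewer generations because recombination has had less time to break them up.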

2.11. Genetic diversity

We quantified genetic diversity by estimating genome‐wide heterozygous rate (H) and nucleotide diversity (π). Whole‐genome H of each sample was calculated by dividing the total number of heterozygous SNPs by the successfully assembled autosomal genome size (Cho et al., 2013). H of genic, intron, exon, CDS and UTR regions was also calculated. Comparisons of different populations and genomic regions were conducted using a two‐sided pairwise t‐test in R (version 4.1.2) (R Development Core Team, 2012). π was estimated in 5‐Mbp windows (Feng et al., 2019) across all autosomes by VCFtools (version 0.1.16) (Danecek et al., 2011). GO enrichment analyses were performed on the diversity hotspot regions with the top 20 π values in R by using the package “clusterProfiler” (Wu et al., 2021; Yu et al., 2012). Each significantly enriched category included at least two genes, and the hypergeometric test was used to estimate significance (p < .05).

2.12. Inference of inbreeding history

Runs of homozygosity (ROH) were identified with Plink (version 1.9) (Chang et al., 2015) with parameters: ‐‐homozyg‐window‐snp 20 ‐‐homozyg‐kb 10 ‐‐homozyg‐density 250. The individual inbreeding coefficient FROH was then estimated as the overall proportion of the genome regions contained in ROH divided by genome size. The generation time of ROH was inferred by the same method used for IBD above.
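As described above, FROH is the summed length of all ROH segments divided by the genome size. The following is a minimal sketch of that ratio, assuming the ROH lengths have already been extracted from the Plink output as base-pair values; the function name `f_roh` and the example numbers are hypothetical.

```python
def f_roh(roh_lengths_bp, genome_size_bp):
    """Inbreeding coefficient F_ROH: fraction of the genome inside ROH."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return sum(roh_lengths_bp) / genome_size_bp


if __name__ == "__main__":
    # Toy values: three ROH segments on a 1 Gb genome.
    segments = [50_000_000, 120_000_000, 30_000_000]
    print(f_roh(segments, 1_000_000_000))
```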

2.13. Mutational load

The mutational load was quantified here to explore the potential genetic threats to the fitness and survival of the Chinese alligator. To avoid a reference bias when comparing the three populations, we used the A. mississippiensis allele as the ancestral allele to replace the Chinese alligator reference allele in the conserved regions identified in section 2.8. The software SnpEff (version 4.3) (Cingolani et al., 2012) was used to annotate the retained 312,525 SNPs to identify three different categories of mutations: (1) missense mutations, (2) loss of function (LOF) mutations and (3) synonymous mutations. We considered “stop_gained”, “splice_donor_variant” and “splice_acceptor_variant” as LOF mutations. We then diagnosed the deleterious nonsynonymous SNPs (dnsSNP) by calculating the Grantham score (GS) (Grantham, 1974), a measurement of the physical/chemical properties of amino acid changes. GS for nonsynonymous SNPs was calculated by using ANNOVAR (version 2020Jun08) software (Wang et al., 2010) with the parameter “‐‐aamatrixfile grantham matrix”, and a mutation with GS > 150 was designated as deleterious (Li et al., 1984). The ratio of homozygous to homozygous and heterozygous derived alleles and the number of these variants were then compared using a two‐sided pairwise t‐test in R (version 4.1.2) (R Development Core Team, 2012). The contribution of inbreeding to the accumulation of homozygous missense mutations (Rnhom) was calculated by dividing the ratio of missense to synonymous counts of homozygous derived alleles inside ROH by the corresponding ratio outside ROH (Wang et al., 2021).
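The Rnhom statistic described above is a ratio of two ratios: missense over synonymous homozygous derived counts inside ROH, divided by the same quantity outside ROH. The following is a minimal sketch of that calculation; the function name `r_nhom` and the counts in the example are illustrative and not results from the study.

```python
def r_nhom(missense_in, synonymous_in, missense_out, synonymous_out):
    """Ratio of missense/synonymous homozygous derived counts, inside vs outside ROH.

    Values above 1 would indicate relatively more homozygous missense
    variants inside ROH than outside.
    """
    if synonymous_in == 0 or synonymous_out == 0 or missense_out == 0:
        raise ValueError("denominator counts must be non-zero")
    inside = missense_in / synonymous_in
    outside = missense_out / synonymous_out
    return inside / outside


if __name__ == "__main__":
    # Toy counts only.
    print(r_nhom(30, 100, 20, 100))
```

Normalising by synonymous counts controls for the simple fact that longer ROH contain more variants of every kind.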

We predicted the risk of assisted gene flow (i.e., assuming successful post‐translocation mating) by counting newly introduced deleterious alleles as in rhinoceros populations (von Seth et al., 2021). We showed the number of unique homozygous LOF and dnsSNP in each individual selected for translocation, while the mutations were absent in all individuals of the receiving population. We also performed a statistic on shared and unique nonsynonymous mutations for all pairwise individuals within and between populations.

2.14. Evidence for recent positive selection

We identified all candidate SNPs under recent positive selection in each population by applying the integrated haplotype score (iHS, version 1.3) (Voight et al., 2006) with the major allele in the three populations as the ancestral state. The iHS values were normalized by subtracting the genome‐wide mean iHS and dividing by the standard deviation (whole‐genome homozygosity analysis and mapping machina [WHAMM], http://coruscant.itmat.upenn.edu/whamm/index.html). Sites with an iHS score above or below the threshold (top or bottom 1%) were considered as candidate ancestral or derived mutations under positive selection. We used four approaches to identify genes in these candidate regions: (1) sliding 100‐kb windows with a 50‐kb step across the whole genome and summing up iHS scores of candidate SNPs in each window, with genes intersecting these windows sorted by the iHS score; (2) selecting genes in a 5‐kb flanking region around each candidate mutation; (3) using nonoverlapping 50‐SNP windows to select genes; (4) summing the iHS score for each gene with candidate mutations. Finally, genes detected by all four methods were considered to be under recent positive selection.
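The normalization and thresholding steps above amount to a z-score transform followed by selecting the two 1% tails. A minimal standard-library sketch follows. The function names are hypothetical, the use of the population standard deviation is an assumption (the text does not say which estimator was used), and the sketch standardizes genome-wide as the text describes rather than within allele-frequency bins.

```python
import statistics


def standardize(scores):
    """Subtract the genome-wide mean and divide by the standard deviation."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)  # assumption: population SD
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mean) / sd for s in scores]


def tail_candidates(z_scores, fraction=0.01):
    """Indices of the lowest and highest `fraction` of standardized scores."""
    n = len(z_scores)
    k = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: z_scores[i])
    return order[:k], order[-k:]


if __name__ == "__main__":
    raw = [float(i) for i in range(200)]  # toy scores
    z = standardize(raw)
    low, high = tail_candidates(z)
    print(low, high)
```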

The population branch statistic (PBS) was then performed to investigate the recent selective effect (Yi et al., 2010) in each population. We estimated F ST for each gene between pairwise populations in VCFtools and used 0.999999 to replace 1 to avoid infinite PBS value. The divergence specific to the branch for each gene was calculated by the following formulas:

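$$t_{12} = -\log\left(1 - F_{ST12}\right) \tag{1}$$

$$t_{13} = -\log\left(1 - F_{ST13}\right) \tag{2}$$

$$t_{23} = -\log\left(1 - F_{ST23}\right) \tag{3}$$

$$\mathrm{PBS} = \frac{t_{12} + t_{13} - t_{23}}{2} \tag{4}$$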

All genes with a PBS value larger than the 99.8th percentile of the distribution of the PBS values were reported.
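The PBS calculation follows the standard form of Yi et al. (2010): each pairwise F ST is transformed to a branch length t = −log(1 − F ST), and the branch specific to population 1 is (t12 + t13 − t23)/2. The sketch below implements that, including the replacement of 1 by 0.999999 mentioned above. Clamping negative F ST estimates to zero is an added assumption, not something stated in the text, and the function names are hypothetical.

```python
import math

FST_CAP = 0.999999  # value used in the text to avoid an infinite PBS


def branch_length(fst):
    """Transform F_ST to a divergence time, t = -log(1 - F_ST)."""
    # Assumption: negative F_ST estimates are treated as zero.
    fst = min(max(fst, 0.0), FST_CAP)
    return -math.log(1.0 - fst)


def pbs(fst_12, fst_13, fst_23):
    """Population branch statistic for population 1."""
    t12 = branch_length(fst_12)
    t13 = branch_length(fst_13)
    t23 = branch_length(fst_23)
    return (t12 + t13 - t23) / 2.0


if __name__ == "__main__":
    # Toy per-gene F_ST values, not results from the study.
    print(round(pbs(0.5, 0.5, 0.0), 4))
```

A gene scores highly only when population 1 is differentiated from both other populations while those two remain similar to each other.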

3. RESULTS

3.1. Improved de novo assembly of the Chinese alligator

We assembled a chromosome‐scale genome of the Chinese alligator with high quality, contiguity, and accuracy by using a combination of Illumina short reads (135.65 Gb, 120‐fold), PacBio long reads (230.03 Gb, 100‐fold) and Hi‐C reads (309.80 Gb, 135‐fold) (Table S2). Genome size was estimated to be 2.42 Gb with a heterozygosity rate of 0.07% by calculating the frequency of 17‐mers using 135.65 Gb Illumina short reads (Figure S1 and Table S4). The total length of our assembly was 2.30 Gb, accounting for 95% of the estimated genome size. The contig N50 and scaffold N50 were 22.53 Mb and 219.05 Mb, respectively (Tables 1 and S5). The number of chromosomes of the Chinese alligator is reported to be 2n = 32 (Zeng et al., 2011). Here, scaffolds totaling 2.26 Gb (98.26% of our assembly) were anchored into 16 chromosomes (length: 32 to 307 Mb) (Figure 1a), consistent with the karyotype study, indicating the accuracy of our assembly at the chromosome scale. BUSCO analysis showed that 96.6% of 2586 BUSCO genes (database: vertebrata_odb10) were identified, with 95.9% single‐copy and 0.7% duplicated. The remaining 2.4% and 1.0% were fragmented and missing, respectively. The GC content of this genome was 44.93% (Figure S2), which is very close to that of three other crocodiles (A. mississippiensis: 44.3%, GenBank ID: GCA_000281125.4; C. porosus: 43.85%, GenBank ID: GCA_000768395.2; G. gangeticus: 44.95%, GenBank ID: GCA_001723915.1) (Green et al., 2014). Finally, 99.24, 99.47 and 91.70% of the Illumina short reads, Hi‐C reads and RNA‐seq data were successfully mapped onto our assembled genome, respectively.
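The contig and scaffold N50 values quoted above are the length of the sequence at which the cumulative total, counted from the longest sequence downwards, first reaches half of the assembly size. The following is a minimal sketch of that definition; the function name `n50` and the toy lengths are illustrative and unrelated to the actual assembly.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Toy contig lengths.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because N50 depends only on the length distribution, it measures contiguity and says nothing about whether the joined sequences are correctly ordered, which is why the text also reports BUSCO completeness and read mapping rates.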

TABLE 1.

Comparison of assembly statistics between the assembly in this study and the previous version (GenBank ID: GCF_000455745.1)

Parameters | Current assembly | GCF_000455745.1
Assembly approach | WGS, PacBio and Hi‐C | WGS
Sequence depth (X) | 355 | 109
Contig N50 (Mb) | 22.53 | 0.02
Scaffold N50 (Mb) | 219.05 | 2.2
Genome size (Mb) | 2296.17 | 2274.86
Predicted genes (n) | 21,862 | 22,200

FIGURE 1.

Landscape of Chinese alligator genome. (a) Distribution of genome landscape: (1) population‐scale π‐values across 16 chromosomes; (2) density of SNP; (3) density of indels; (4) gene count; (5) read depth mapped to the genome; (6) GC content density. The statistics were calculated using a 500‐kbp window. (b) Divergence time of fifteen species generated by MCMCtree using the maximum likelihood method. The numbers in brackets show 95% confidence intervals of estimated divergence time between lineages.

Further, our assembly showed extremely high collinearity with the scaffold‐level genome released previously (GenBank ID: GCF_000455745.1, Figure S3) (Wan et al., 2013), with an increase of 21.31 Mb in assembled size and 1.8% in BUSCO score (Table S6). The contiguity statistics showed a 99‐fold improvement in scaffold N50 and a 979‐fold improvement in contig N50 when comparing the new assembly with the GCF_000455745.1 genome. In addition, our genome covered 99.59% of the short‐read‐based genome, and filled 114.85 Mb of gaps in it. Using comparative genomic analysis, our genome supported that the Chinese alligator and A. mississippiensis are sister groups that separated from each other c. 32 million years (My) BP (Figure 1b). Moreover, we confirm that the divergence between the common ancestor of birds and the crocodilian lineage occurred c. 240 My BP (Wan et al., 2013).

3.2. Improved genome annotation

We identified 814.74 Mb repetitive elements in our assembled Chinese alligator genome, representing 35.48% of the total genome (Table S7), which was comparable with three other crocodiles (Green et al., 2014). The most abundant repeat category was LTRs (20.10%), followed by LINEs (12.93%), DNA elements (5.74%) and SINEs (0.04%) (Figure S4 and Table S8). There also existed 8.27 Mb (0.36%) unknown repeat elements. We masked all these repeat sequences for genome annotation.

By combining evidence from de novo prediction, transcript mapping and homology‐based alignment, we predicted 21,862 high‐confidence protein‐coding genes (Figure S5 and Table S9), which is generally consistent with the number of genes in the previous prediction (22,200 genes) (Wan et al., 2013). The average gene length, intron length and exon length were 36.25 kb, 4.47 kb and 173.79 bp (8.77 exons per gene), respectively (Figure S6 and Table S10). An obvious peak around 1000 bp was found in the gene length distribution of the GCF_000455745.1 and A. mississippiensis assemblies, but not in our chromosome‐scale assembly. Finally, 19,962 (91.31%) protein‐coding genes were functionally annotated in at least one of the four databases we used (see Section 2) (Figure S7 and Table S11), which was more than the functionally annotated gene set of GCF_000455745.1 (17,615 genes) (Wan et al., 2013). In addition, 2681 miRNA, 544 rRNA, 1545 tRNA and 1077 snRNA were predicted in our assembly (Table S12).

3.3. Population structure analysis

The average whole‐genome sequencing coverage and depth for the 23 individuals were 98.89% and 10.50‐fold, respectively (Table S13). After filtering (see Section 2), we obtained 1,129,456 SNPs for downstream analysis. PCA, admixture and phylogenetic tree analyses all assigned the CX, XC and ACA populations to three distinct clusters, consistent with their geographical distribution (Figure 2a–d). The F3 statistics also supported a lack of admixture among the three Chinese alligator populations (Figure 2e). Pairwise FST further revealed that they were significantly distinct from each other (FST ACA‐CX = 0.23; FST CX‐XC = 0.20; FST ACA‐XC = 0.16). According to the rooted phylogenetic tree of the three populations with A. mississippiensis (Figure S8), XC is more likely the ancestral population, and the LD decay curve showed the fastest decrease of r² to 50% of its maximum value in XC (Figure S9).
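
As an illustration of what a pairwise FST value summarises, the sketch below implements Hudson's estimator as a ratio of averages over SNPs. This is an assumption made for illustration only: the text above does not state which estimator was used, and the function name, allele frequencies and sample sizes are hypothetical.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's F_ST for two populations, combined across SNPs.

    p1, p2 : per-SNP frequencies of the same allele in each population
    n1, n2 : number of sampled chromosomes in each population
    The numerator and denominator are summed over SNPs before dividing
    (a "ratio of averages"), which is more stable than averaging
    per-SNP ratios.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        numerator += ((a - b) ** 2
                      - a * (1 - a) / (n1 - 1)
                      - b * (1 - b) / (n2 - 1))
        denominator += a * (1 - b) + b * (1 - a)
    if denominator == 0:
        raise ValueError("no informative (polymorphic) sites")
    return numerator / denominator


# Hypothetical allele frequencies at three SNPs in two populations.
print(hudson_fst([0.9, 0.2, 0.5], [0.1, 0.3, 0.5], n1=12, n2=16))
```

Values near 0 indicate little differentiation, whereas values approaching 1 indicate that the two populations are close to fixed for different alleles.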

3.4. Population demographic history

The ancient demographic trajectories of the three populations were highly similar, with a continuous decline from 20 ka BP to the present day (Figures 3a,b and S10). This decline could be divided into three phases: a first decline around 20–4 ka BP, a relatively stable state c. 4.0–1.5 ka BP, and a second marked decline after 1.5 ka BP leading to an extremely low Ne. We further inferred that the divergence among the three populations started c. 4.2 ka BP, with complete separation within the past 1000 years (Figure 3c). According to the RCCR curves, ACA and CX split c. 1.0 ka BP, CX and XC c. 0.8 to 1.1 ka BP, and ACA and XC c. 0.5 ka BP.

FIGURE 3.

Estimated demographic history and divergence time of the three Chinese alligator populations. (a) Thick coloured lines depict temporal fluctuations in effective size (Ne) over the past 10–0.5 ka. The x‐axis corresponds to the time before present in years on a log scale, assuming a substitution rate (μ) of 0.79 × 10⁻⁸ substitutions/site/generation and a generation time (g) of 20 years (Green et al., 2014). The coloured rectangles depict several extreme climate events including the last glacial maximum (LGM), the Younger Dryas cold period (YD), MWP‐1A, 1B and the 8.2 ka BP cooling event (Lambeck et al., 2014) as well as the 4.2 ka BP aridification event (Zhang, Cheng, et al., 2018). Thin coloured lines indicate the mean annual temperature (TANN) for the lower Yangtze region in the Holocene (Li et al., 2017). (b) Recent effective population size inferred by PopSizeABC. Dotted lines indicate a 90% confidence interval and grey rectangles depict the Little Ice Age (LIA) (Jiang & Zhang, 2004). (c) Split times between pairwise populations calculated by MSMC2. The grey rectangle depicts the Song Dynasty period, which coincided with serious damage to the habitats of Chinese alligators (Barker, 2012).
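
The substitution rate and generation time given in this caption are what convert coalescent-scaled output into years and effective sizes. The sketch below shows the conventional scaling for MSMC-style output (scaled time boundaries and coalescence rates) and the usual definition of the relative cross-coalescence rate. Whether the authors applied exactly these steps is an assumption, and the function names and example inputs are hypothetical.

```python
MU = 0.79e-8     # substitutions/site/generation, as stated in the caption
GEN_TIME = 20    # years per generation, as stated in the caption


def scaled_time_to_years(scaled_time, mu=MU, gen_time=GEN_TIME):
    """Convert a mutation-scaled time boundary into years before present."""
    return scaled_time / mu * gen_time


def coalescence_rate_to_ne(coal_rate, mu=MU):
    """Convert a scaled coalescence rate (lambda) into effective size."""
    return 1.0 / (2.0 * mu * coal_rate)


def relative_cross_coalescence(lambda_within_a, lambda_across, lambda_within_b):
    """Cross-population rate relative to the mean within-population rate.

    Values near 1 suggest one panmictic population; values near 0
    suggest complete separation.
    """
    return 2.0 * lambda_across / (lambda_within_a + lambda_within_b)


# Hypothetical scaled values, not taken from the study's output files.
print(scaled_time_to_years(7.9e-6))
print(coalescence_rate_to_ne(5000.0))
print(relative_cross_coalescence(2.0, 1.0, 2.0))
```

Under this convention, a split time is often read off as the point at which the relative cross-coalescence rate crosses an intermediate value such as 0.5; the threshold used in this study is not stated in the text above.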

3.5. Gene flow among the three Chinese alligator populations

Estimating gene flow is essential for understanding the degree of connectivity among populations, and for guiding translocation and genetic rescue efforts for endangered species. We first performed the ABBA‐BABA test (A, B; X, Y), setting A. mississippiensis as the outgroup. For most of the combinations, no significant deviations were found in the shared derived alleles between XC‐ACA and XC‐CX (D [CX,ACA; XC,AM] = 0.024, Z = 1.359), except for two individuals in XC (XC1, XC3), which shared significantly more derived alleles with ACA and CX, respectively (Figure 4a). TreeMix analysis detected gene flow between ACA and XC, which was somewhat consistent with the result of the ABBA‐BABA test (Figure 4b). A detailed scan of IBD fragments found that very long IBD fragments (generated within five generations) were absent among populations, although they were found within CX (Figure 4c). However, shared IBD fragments dating to more than five generations (100 years) before present were more abundant both within and between populations (Figures 4d,e and S11).

FIGURE 4.

Gene flow between the three Chinese alligator populations. (a) Estimates of D (CX, ACA; XC, AM), in which only one positive and one negative statistic were considered statistically significant following correction for multiple testing, based on a Z‐score > 3 or < −3. (b) Maximum‐likelihood tree calculated by TreeMix, with A. mississippiensis serving as an outgroup. (c–e) Lengths of IBD (cM) shared between pairwise individuals within and between populations that were generated 0–5 (c), 5–10 (d) and 10–15 (e) generations before the present.
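
The D statistic and the Z‐score threshold used in this section can be sketched as follows. This is a generic illustration of the ABBA‐BABA test with a delete‐one block jackknife over equally weighted blocks; the function names, the sign convention (positive D meaning excess sharing between the second and third populations) and the example counts are assumptions, not the authors' pipeline.

```python
import math


def abba_baba_counts(p1, p2, p3, p_out):
    """Expected ABBA and BABA counts from derived-allele frequencies.

    p1, p2, p3, p_out are per-SNP derived-allele frequencies in the
    populations (P1, P2; P3, Outgroup).
    """
    abba = sum((1 - a) * b * c * (1 - o) for a, b, c, o in zip(p1, p2, p3, p_out))
    baba = sum(a * (1 - b) * c * (1 - o) for a, b, c, o in zip(p1, p2, p3, p_out))
    return abba, baba


def d_statistic(abba, baba):
    """Patterson's D: positive when P2 shares more derived alleles with P3."""
    return (abba - baba) / (abba + baba)


def jackknife_z(block_counts):
    """Return (D, Z) from per-block (ABBA, BABA) counts.

    Uses a delete-one jackknife that treats blocks as equally weighted.
    """
    n = len(block_counts)
    if n < 2:
        raise ValueError("need at least two blocks")
    total_abba = sum(a for a, _ in block_counts)
    total_baba = sum(b for _, b in block_counts)
    d_full = d_statistic(total_abba, total_baba)
    leave_one_out = [d_statistic(total_abba - a, total_baba - b)
                     for a, b in block_counts]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    return d_full, d_full / math.sqrt(variance)


# Hypothetical per-block counts, not data from this study.
d, z = jackknife_z([(12, 8), (10, 10), (15, 9), (9, 11)])
print(d, z, abs(z) > 3)
```

With real data the blocks would typically be large genomic windows, so that linkage between neighbouring SNPs does not inflate the Z‐score.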

3.6. Genetic diversity

Genome‐wide heterozygosity (H) was extremely low in all samples, with an average H = 1.20 × 10⁻⁴ ± 1.16 × 10⁻⁵. This is lower than in the endangered Chinese crocodile lizard (Shinisaurus crocodilurus, 2.0 × 10⁻⁴ – 5.7 × 10⁻⁴) (Xie et al., 2021), five‐pacer viper (Deinagkistrodon acutus, 1 × 10⁻³) (Yin et al., 2016), Swinhoe's soft‐shelled turtle (Rafetus swinhoei, ~1.23 × 10⁻³) and many critically endangered species, and only slightly higher than in the brown‐eared pheasant (9.53 × 10⁻⁵) (Wang et al., 2021) and Iberian lynx (1.02 × 10⁻⁴) (Abascal et al., 2016) (Figure 5a, Table S14). H did not decrease significantly in exons compared to the whole genome. The three populations did not differ significantly, except in the genic and intron regions between ACA and CX (Figure 5b, Tables S15–S16). Genome‐wide π values were remarkably low at both the species level (1.47 × 10⁻⁴) and the population level, ACA (1.14 × 10⁻⁴) < CX (1.15 × 10⁻⁴) < XC (1.41 × 10⁻⁴) (Figure S12), when compared with the crocodile lizard (3.85 × 10⁻⁴ – 5.47 × 10⁻⁴) (Xie et al., 2021) and tuatara (Sphenodon punctatus) (8 × 10⁻⁴ – 1.1 × 10⁻³) (Gemmell et al., 2020). Several hotspot regions harboured genes associated with MHC class I protein binding, olfactory receptor activity, peptide hormone binding, and ubiquitin protein ligase activity (Tables S17–S18).
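
To make the scale of these values concrete, the sketch below computes per‐individual heterozygosity as the fraction of callable sites that are heterozygous, and then summarises it across individuals as mean ± SD. The site counts and sample names are hypothetical and were chosen only to give values of the same order of magnitude as those reported above.

```python
from statistics import mean, stdev


def heterozygosity(n_het_sites, n_callable_sites):
    """Fraction of callable genomic sites that are heterozygous."""
    if n_callable_sites <= 0:
        raise ValueError("number of callable sites must be positive")
    return n_het_sites / n_callable_sites


# Hypothetical (heterozygous sites, callable sites) per individual.
samples = {
    "ind1": (264_000, 2_200_000_000),
    "ind2": (242_000, 2_200_000_000),
    "ind3": (286_000, 2_200_000_000),
}
values = [heterozygosity(het, total) for het, total in samples.values()]
print(f"H = {mean(values):.2e} +/- {stdev(values):.2e}")
```

A value of about 1.2 × 10⁻⁴ corresponds to roughly one heterozygous site every 8,000 bp.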

FIGURE 5.

Characterization of heterozygosity in Chinese alligators. (a) Comparison of genome‐wide heterozygosity among endangered or extinct species and three other crocodilians. Coloured dots represent each individual and whiskers represent the range in a given species. (b) H of different genomic regions in the three populations.

3.7. Inference of inbreeding history by ROHs

Small and endangered populations are usually threatened by inbreeding, which is reflected in ROHs. Overall, we discovered 20,938 ROHs ranging from 0.10 Mb to 36.56 Mb in the 23 individuals. The proportion of the genome in ROH (FROH) ranged from 46.72% to 69.63%: ACA (55.61 ± 3.49%) < CX (55.96 ± 7.60%) < XC (59.16 ± 6.46%) (Table S19), indicating a higher inbreeding level than in crocodile lizards (20%–60%) (Xie et al., 2021) and in mountain and eastern lowland gorillas (34.5% and 38.4%) (Xue et al., 2015), but a lower level than in brown‐eared pheasant populations (>80%) (Wang et al., 2021). Interestingly, the individual with the lowest FROH was CX1 (46.72%), the wild founder of the CX population. ROHs longer than 2 Mb potentially indicate mating between closely related individuals rather than drift alone (Ceballos et al., 2018). All alligator genomes in this study comprised a high proportion (38.88 ± 7.21%) of ROHs >2 Mb. Even when considering only ROHs longer than 10 Mb, FROH was still high (11.76 ± 4.67%). In addition, CX1 bore fewer extremely long ROHs than the other individuals (Figure 6a).
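
The FROH values above, including those restricted to long ROHs, follow from a simple ratio. The sketch below assumes that the denominator is the analysed genome length and that the length thresholds are applied inclusively; the function name, ROH lengths and genome size are hypothetical.

```python
def f_roh(roh_lengths_mb, genome_size_mb, min_length_mb=0.1):
    """Fraction of the genome covered by ROHs of at least min_length_mb."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    in_roh = sum(length for length in roh_lengths_mb if length >= min_length_mb)
    return in_roh / genome_size_mb


# Hypothetical ROH lengths (Mb) for one individual on a 100 Mb toy genome.
toy_rohs = [0.5, 3.0, 12.0, 1.5]
for threshold in (0.1, 2.0, 10.0):
    print(threshold, f_roh(toy_rohs, 100.0, min_length_mb=threshold))
```

Raising the threshold isolates the contribution of long ROHs, which is why FROH for ROHs above 2 Mb or 10 Mb is read as a signal of more recent inbreeding.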

FIGURE 6.

Investigation of inbreeding history by ROH in Chinese alligator. (a) Distribution of ROH in different length categories for each genome. (b) ROH longer than 5 Mb in the genome of three populations and shared ROH regions between pairwise populations and among three populations. (c) Average genome‐wide heterozygosity and ROH density in the three populations. The colour gradient is scaled according to the ROH density, as shown in the legend. (d) Heat map showing the proportion of genomes in ROH regions (top right) and both ROH and IBD (bottom left) regions between pairwise comparisons among individuals. (e) Distribution of ROH resulting from inbreeding in different generations.

In general, the distribution of ROHs differed among populations (Figure S13). In total, 1.84 Gb of ROHs (>0.1 Mb) were shared by all three populations, whereas 3.92 Mb, 4.30 Mb and 24.42 Mb of ROHs (>0.1 Mb) were specific to CX, ACA and XC, respectively. The proportion of ROH regions shared by each pair of populations was also high: FROH>5 Mb was 54.52% for CX‐XC, 48.29% for ACA‐XC and 44.09% for ACA‐CX (Figure 6b). The distribution of ROHs at the individual level was also investigated (Figures 6c and S14). Although the distribution of ROHs across the genome varied among individuals, the proportion of regions shared between any two individuals within a population was high. Nevertheless, pairwise comparisons of shared ROHs in IBD regions differed significantly within and between populations (within: 12.21% ± 4.74%; between: 4.90% ± 1.73%; wilcox.test, p < 2.2e‐16) (Figure 6d).

To explore their inbreeding history, we calculated the expected time (in generations) at which each ROH was generated (Figure 6e). FROH generated within five generations (>11.24 Mb) varied among individuals, with the founder (CX1) having the lowest value. FROH generated 5–10 generations ago (5.62–11.24 Mb) was higher than that generated within five generations (FROH>11.24 Mb). FROH from more distant periods, more than 10 generations ago, fluctuated between 5% and 10%, a more stable pattern than for 0–5 and 5–10 generations. The distribution of expected ROH ages indicated that ROHs in Chinese alligator genomes have accumulated gradually since at least 2 ka BP.
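
The length thresholds quoted above (11.24 Mb for five generations, 5.62 Mb for 10 generations) are consistent with the commonly used expectation g = 100 / (2rL), where L is the ROH length in Mb and r is the recombination rate in cM/Mb. The sketch below works backwards from those thresholds; the implied rate of about 0.89 cM/Mb is therefore an inference from the numbers in this section, not a value stated in it, and the function names are hypothetical.

```python
# Recombination rate implied by "5 generations <-> 11.24 Mb" (an assumption
# derived from the thresholds in the text): 100 / (2 * 5 * 11.24) cM/Mb.
IMPLIED_RATE_CM_PER_MB = 100 / (2 * 5 * 11.24)


def roh_age_generations(length_mb, rate_cm_per_mb=IMPLIED_RATE_CM_PER_MB):
    """Expected number of generations since the common ancestor of an ROH."""
    return 100 / (2 * rate_cm_per_mb * length_mb)


def expected_roh_length_mb(generations, rate_cm_per_mb=IMPLIED_RATE_CM_PER_MB):
    """Expected ROH length (Mb) for inbreeding that occurred g generations ago."""
    return 100 / (2 * rate_cm_per_mb * generations)


for length in (11.24, 5.62, 1.0):
    print(length, round(roh_age_generations(length), 1))
```

Under the same assumption, ROHs tracing back 100 generations (2 ka at 20 years per generation) would be expected to be about 0.56 Mb long, which is still above the 0.1 Mb minimum ROH length used here.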

3.8. Mutational load

When estimating mutational load in each sample, we first found 18.91 ± 3.39 heterozygous dnsSNPs and 12.17 ± 3.91 heterozygous LOFs, without significant differences among the three populations (Figure S15). As expected, there were fewer homozygous dnsSNPs (11.26 ± 4.39) and LOFs (10.09 ± 2.64) than heterozygous ones. However, the numbers of homozygous dnsSNPs and LOFs differed significantly among populations, with XC presenting the highest numbers (dnsSNP: 14.89 ± 1.52; LOF: 12.33 ± 1.63), compared with the ACA population (dnsSNP: 13.17 ± 2.27; LOF: 9.33 ± 2.36) and the CX population (dnsSNP: 5.75 ± 0.97; LOF: 8.13 ± 1.69) (Figures 7a and S16). Further inspection of Rnhom showed that homozygous missense mutations tended to be located within ROHs rather than outside them, with an Rnhom value of 2.24 ± 1.98. Moreover, in three CX individuals all homozygous missense mutations were located within ROHs (Figure S17).

FIGURE 7.

Mutational load in three Chinese alligator populations. (a) Comparison of the count of homozygous nonsynonymous mutations, including missense, LOF and dnsSNP mutations. (b) Venn diagram for nonsynonymous mutations. (c) Venn diagram for genes carrying nonsynonymous mutations. (d) GO enrichment of biological process for all genes in (c). (e) Counts of new dnsSNP and LOF mutations introduced by each individual if gene flow occurred.

Among all 1235 derived mutations, including missense, LOF and dnsSNP mutations, 38% were shared among the three populations, whereas 3%, 3% and 25% were specific to ACA, CX and XC, respectively (Figure 7b). Of the 613 genes harbouring these mutations, 44% were shared and 29% were population‐specific: CX (2%) < ACA (4%) < XC (23%) (Figure 7c). GO analysis indicated that these 613 genes are associated with several important biological processes, including cell growth and development (LRP1, MR1, SEMA4G, DBN1, PTPRS), cell morphogenesis (MYO10, ZNF135), bone development (KIT, TNF), lens fibre cell differentiation (SPRED2), reproduction (FSIP2, TEKT2, PLEKHA5, METTL3, MOV10L1), immunity (CD274, MR1) and nervous system (AKAP12, KNDC1). Population‐unique genes were involved in similar functions, although the XC population also carried several cell‐cycle‐related genes, such as TOP2A, a classic proliferation marker (Table S20).

In the prediction of newly introduced deleterious mutations in cross‐breeding programmes, XC individuals would introduce the largest numbers into the other two populations, including 0–20 dnsSNPs and 0–13 LOFs (Figure 7e; Table S21), consistent with the fact that XC accounted for the highest proportion of population‐specific deleterious mutations. In pairwise comparisons at the individual level, the number of shared deleterious alleles was unsurprisingly higher within a population, whereas unique alleles were more numerous when the compared samples came from different populations (Figure S18).
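
The logic of counting newly introduced deleterious alleles can be sketched as a set difference: alleles carried by a candidate donor that are absent from every individual of the recipient population. The sketch below is one plausible reading of that prediction rather than the authors' exact procedure, and the variant identifiers and function name are hypothetical.

```python
def newly_introduced_alleles(donor_alleles, recipient_individuals):
    """Deleterious alleles in the donor that no recipient individual carries.

    donor_alleles         : set of variant identifiers carried by the donor
    recipient_individuals : iterable of sets, one per recipient individual
    """
    already_present = set().union(*recipient_individuals)
    return set(donor_alleles) - already_present


# Hypothetical variant identifiers, not real calls from this study.
donor = {"chr1:100:A>T", "chr2:500:G>A", "chr5:42:C>T"}
recipients = [{"chr1:100:A>T"}, {"chr3:7:G>C", "chr5:42:C>T"}]
new = newly_introduced_alleles(donor, recipients)
print(len(new), sorted(new))
```

Ranking candidate donors by this count is one simple way to choose individuals that would add the fewest new deleterious alleles to a recipient population.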

3.9. Signatures putatively under local adaptation

By applying the iHS method, we identified 9077, 9520 and 12,766 SNPs under putatively recent positive selection in ACA, CX and XC, corresponding to 260, 224 and 442 genes, respectively. A large proportion of these genes were population‐specific: CX (16%) < ACA (19%) < XC (42%) (Figure S19). For positively selected genes unique to each population, however, GO enrichment did not show strong evidence for overrepresented categories that could be associated with local adaptation (Table S22). Finally, PBS analysis identified 28, five and four genes with a signal of positive selection unique to CX, ACA and XC, respectively (Figure S20). Further investigation of gene function revealed selection on genes related to immunity, possibly implying that different immune genes are essential for each population (Table S23).
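
For reference, the population branch statistic (PBS) is derived from the three pairwise FST values of a population trio after a log transformation. The sketch below shows the standard formula; in practice it is applied per window or per gene, so feeding it the genome‐wide FST values reported in Section 3.3 is only an illustration of the arithmetic. The function names are hypothetical.

```python
import math


def branch_length(fst):
    """Log-transformed divergence time T = -ln(1 - F_ST); requires F_ST < 1."""
    return -math.log(1.0 - fst)


def pbs(fst_focal_b, fst_focal_c, fst_b_c):
    """Population branch statistic for the focal population.

    fst_focal_b, fst_focal_c : F_ST between the focal population and B, C
    fst_b_c                  : F_ST between the two other populations
    """
    return (branch_length(fst_focal_b)
            + branch_length(fst_focal_c)
            - branch_length(fst_b_c)) / 2.0


# Illustration with the genome-wide values quoted earlier, CX as focal:
# F_ST(CX, ACA) = 0.23, F_ST(CX, XC) = 0.20, F_ST(ACA, XC) = 0.16.
print(pbs(0.23, 0.20, 0.16))
```

Windows or genes whose PBS is far above the genome‐wide background for one population are the candidates for selection specific to that population.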

4. DISCUSSION

Here, we present the first high‐quality chromosome‐level assembly and population genomic exploration of Crocodilia. The Chinese alligator is the most threatened crocodile in the world; it is therefore essential to investigate its genomic background in order to plan sound, science‐based conservation strategies (Supple & Shapiro, 2018).

4.1. An improved genome assembly and annotation

Although the previous short‐read assembly revealed the possible genetic basis of the species' biological characteristics (Wan et al., 2013), only a substantially improved genome allows accurate analyses of genomic characteristics such as ROH and IBD. Thanks to PacBio long‐read sequencing and Hi‐C technology, we assembled the most continuous, complete and high‐quality reference genome for the Chinese alligator so far. Compared with the previously released assembly (GenBank: GCF_000455745.1), the new assembly shows clear advantages in scaffold N50, gene length, completeness of the gene set and the number of annotated genes. This chromosome‐scale assembly and genome annotation will facilitate the assessment of genetic diversity, inbreeding status, mutational load and local adaptation, which are extremely important for isolated Chinese alligator populations that have been declining over the long term.

4.2. Population structure analysis revealed genetic diversity extinct in the wild

CX and XC are now the last two and largest populations in the species' original habitats, representing the sole remaining genetic diversity of the Chinese alligator in China. A previous study based on MHC class IIb gene analyses reported that the breeding populations of XC and CX were genetically closely related when compared with the wild population from Xuancheng (Nie et al., 2013). Individuals in American sanctuaries were transferred from China around one century ago (Behler, 1993; Honegger & Hunt, 1990), but their geographical origin and genetic background are ambiguous. Here, we clarify for the first time the genetic relationships of the three populations, proposing that XC is more likely the ancestral population and that ACA is distinct from CX and XC. This result is somewhat consistent with Nie et al. (2013), considering that the XC population originated from wild samples from Xuancheng.

Pairwise FST among the three populations was smaller than that between two tuatara populations (Gemmell et al., 2020), comparable to that among three Chinese crocodile lizard populations in China (Xie et al., 2021), but larger than that between the Sichuan and Qinling giant panda subspecies (Guang et al., 2021), among African leopard populations (Pečnerová et al., 2021), and even between the human populations of Africa and Asia (Altshuler et al., 2005), further supporting the extent of the geographical isolation of these three populations. We hypothesized that ACA split from the two Chinese populations much earlier than the time its founders left China, because such a large genetic difference is unlikely to have evolved within one century. The divergence times estimated by MSMC2 were consistent with this hypothesis. Therefore, the geographical origin of the ACA ancestors is distinct from that of CX and XC, and ACA may represent the genetic diversity of a wild‐extinct population. Animals from ACA could be considered as potential donors for translocation programmes.

4.3. Causes of long‐term population declines

Our demographic reconstruction for the three populations showed highly consistent trends, with a general decline over the past 20 ka (Figure 3a,b). For a long period before the Song Dynasty, vast areas in the Yangtze River region were not exploited by humans (Wen, 2000). The Ne decline starting c. 20 ka BP may have been caused by the cold climate of the LGM (Lambeck et al., 2014), possibly reflecting a drastic reduction in population size. Subsequently, Ne remained stable after the 4.2 ka BP aridification event (Zhang et al., 2018), while population divergence began at this time point. The serious drought could have reduced the area of wetland and river systems, thus partly building geographical barriers and hampering individual migration and gene flow, which would have led to a further decrease in Ne. During the Southern Song Dynasty (between 1.5–1.3 ka BP), the economic centre of gravity shifted from the Yellow River Basin to the Yangtze River Basin (Wang et al., 2021). This shift was accompanied by deforestation, hunting and intense farming, with the introduction of new strains of rice and improved methods of water control and irrigation (Wen, 2000). Such destructive anthropogenic disturbance may have been an important factor in the contraction of Chinese alligator populations from 1.5 ka BP onwards, thus providing a possible explanation for the population isolation (Figure 3c). Contemporary Ne is still decreasing, at least partly reflecting the declining census population size in recent years (Ding et al., 2001; Ding & Wang, 2004) and raising concerns for the long‐term survival of wild populations.

4.4. Genomic consequences of continuous declining

Severe population declines are likely to lead to a loss of genetic diversity, an increase in inbreeding, exposure of deleterious alleles in the homozygous state, and strong drift that leads to fixation of deleterious alleles. The most direct consequence of the long‐term decline in Chinese alligators is the deficiency in genomic heterozygosity, which is lower than in most endangered animals. However, heterozygosity in CDS regions was only slightly lower than in intron regions, which has also been observed in A. mississippiensis and G. gangeticus, in stark contrast with other representative species such as chicken (G. gallus) and green anole (A. carolinensis) (Green et al., 2014; Wan et al., 2013). This may indicate that maintaining the remaining low polymorphism in CDS is an evolutionary strategy for preserving fitness despite declining genetic diversity in some fragile species (e.g., narwhal) (Westbury et al., 2019).

Long ROHs that have not been broken by recombination are probably the result of recent inbreeding (Mcquillan et al., 2008). Between 19.80% and 48.87% of the total ROHs appear to have been generated within the most recent 10 generations, whereas 86.93%–92.04% were generated within the past 100 generations. This result indicates that, although the three populations have been declining continuously for 20 ka, frequent inbreeding mostly occurred within the past 200 years. We inferred that both the recent founder effect and the long‐term small population size contributed to their high inbreeding level. Interestingly, the FROH generated in the past five generations was lower than that generated 5–10 generations ago (i.e., before the breeding programmes began; generation time in breeding programmes: 7 years), indicating that the inbreeding level in the wild populations during the 20th century may be comparable with that in breeding centres. Shared ROHs in IBD regions represent four identical haplotypes in the two samples. Encouragingly, this index differed significantly between within‐population and between‐population pairs of individuals, suggesting that the genetic status of the populations could be improved by gene flow among them.

Mutational load provides an important way of assessing the exposure to genetic threats in small populations (von Seth et al., 2021). Significant differences among populations were observed in the number of homozygous, but not heterozygous, deleterious mutations. Furthermore, inbreeding led to the accumulation of homozygous missense alleles, contrary to the result in the brown‐eared pheasant populations (Wang et al., 2021). Several deleterious alleles were located in genes related to osteoclast differentiation, cell development and growth, organ morphogenesis, lens fibre cell differentiation and the retinoic acid receptor signalling pathway; these may be candidate SNPs responsible for congenital malformations (Figures S21–S22) (Wu et al., 1999). However, further genetic investigation and experimental verification are needed.

4.5. Implications for conservation and genetic rescue attempts

Although species with small Ne and low genetic diversity can still survive for thousands of generations (Robinson et al., 2016; Wang et al., 2021), genetic effects can severely reduce the fitness of species and hamper demographic recovery because of the limited adaptive potential of small populations (Willi et al., 2006). Assisted gene flow is regarded as a vital method for the recovery of endangered populations in conservation genetics. From 2001 to 2006, some individuals from XC and ACA were translocated to the Changxing Centre (Ni, 2012). Nevertheless, weighing the risks (e.g., outbreeding depression) and benefits (e.g., genetic rescue) of translocations is essential in such a cross‐breeding programme.

We first evaluated the likelihood of introducing new deleterious alleles when selecting different individuals as donors to move to the recipient population. Because of the existence of population‐unique dnsSNPs and LOFs (CX < ACA < XC), a risk of increasing mutational load seems unavoidable, especially when introducing XC individuals into the CX population, although it could be alleviated by careful choice of donors. Overall, the counts of carried‐over deleterious alleles were lower than those in rhinoceros populations (von Seth et al., 2021). Second, we examined signatures of positive selection to identify potential signatures of adaptation. Even though the populations were genetically highly distinct, no obvious indication of local adaptation was observed in any population. This result could be explained by recent geographical isolation and similar habitats.

5. CONCLUSIONS

Investigation of the genetic background of endangered species is essential for their protection and conservation. Here, we assembled the first chromosome‐scale genome of the Chinese alligator and applied whole‐genome data to examine the genomic consequences of severe declines in this critically endangered species. Extensive investigation across their genomes verified their highly endangered status from a population genetic perspective. Furthermore, this study highlights the need to integrate genetic indices into IUCN classification lists (Hoban et al., 2020). Finally, our investigation of the genetic background of the Chinese alligator populations is a valuable resource and provides recommendations for future conservation and management.

AUTHOR CONTRIBUTIONS

Qiu‐hong Wan and Sheng‐Guo Fang conceived and initiated the project. Mengyuan Hu organized and collected the samples. Jun Cao, Lirong Liu and Jianqing Lin performed DNA library preparation and sequencing. Yi Zhang and Qing Wang assembled the improved genome and conducted comparative genomics analysis. Shangchen Yang, Tianming Lan, Haimeng Li, Minhui Shi and Yixin Zhu performed population genetic analysis. Shangchen Yang wrote the manuscript. Tianming Lan, Sunil Kumar Sahu and Nicolas Dussex coordinated the data analysis and extensively revised the manuscript. Qiu‐hong Wan, Huan Liu, and Sheng‐Guo Fang provided supervision. All authors read and approved the final manuscript.

CONFLICT OF INTEREST

The authors declare no conflict of financial interests.

Supporting information

Appendix S1

ACKNOWLEDGEMENTS

We thank the Anhui Research Centre of Chinese Alligator Reproduction (ARCCAR) and Changxing Yinjiabian Chinese Alligator Nature Reserve (CYCANR) for collection of samples, BGI for bioinformatics support and the computing facility at the China National GeneBank. We are grateful to Love Dalén, the professor of evolutionary genetics at the Centre for Palaeogenetics in Stockholm, Sweden, for professional suggestions on this article. This work was supported by the Fundamental Research Funds for the Central Universities of China and Natural Science Foundation of Zhejiang Province, China (LQ21C030008). This work was also supported by the China National GeneBank and the Guangdong Academy of Forestry. Our project was financially supported by funding from the Guangdong Provincial Key Laboratory of Genome Read and Write (grant no. 2017B030301011). Finally, we are thankful to the China National GeneBank for producing the sequencing data and to the Guangdong Provincial Academician Workstation of BGI Synthetic Genomics (no. 2017B090904014).

Yang, S. , Lan, T. , Zhang, Y. , Wang, Q. , Li, H. , Dussex, N. , Sahu, S. K. , Shi, M. , Hu, M. , Zhu, Y. , Cao, J. , Liu, L. , Lin, J. , Wan, Q.‐H. , Liu, H. , & Fang, S.‐G. (2023). Genomic investigation of the Chinese alligator reveals wild‐extinct genetic diversity and genomic consequences of their continuous decline. Molecular Ecology Resources, 23, 294–311. 10.1111/1755-0998.13702

Shangchen Yang, Tianming Lan, Yi Zhang and Qing Wang contributed equally to this work.

Handling Editor: Joanna Kelley

Contributor Information

Qiu‐Hong Wan, Email: qiuhongwan@zju.edu.cn.

Huan Liu, Email: liuhuan@genomics.cn.

Sheng‐Guo Fang, Email: sgfanglab@zju.edu.cn.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study have been deposited into CNGB Sequence Archive (CNSA) (Guo et al., 2020) of the China National GeneBank DataBase (CNGBdb) (Chen et al., 2020) with accession number CNP0002575. The mRNA‐Seq and sRNA‐Seq data of Chinese alligator generated in our previous work were used for genome annotation, which had been deposited to NCBI SRA database under BioProject accession numbers PRJNA556093, and PRJNA556092, respectively.

REFERENCES

  1. Abascal, F., Corvelo, A., Cruz, F., Villanueva‐Cañas, J. L., Vlasova, A., Marcet‐Houben, M., Martínez‐Cruz, B., Cheng, J. Y., Prieto, P., Quesada, V., Quilez, J., Li, G., García, F., Rubio‐Camarillo, M., Frias, L., Ribeca, P., Capella‐Gutiérrez, S., Rodríguez, J. M., Câmara, F., … Godoy, J. A. (2016). Extreme genomic erosion after recurrent demographic bottlenecks in the highly endangered Iberian lynx. Genome Biology, 17(1), 251. 10.1186/s13059-016-1090-1
  2. Alexander, D. H., Novembre, J., & Lange, K. (2009). Fast model‐based estimation of ancestry in unrelated individuals. Genome Research, 19(9), 1655–1664.
  3. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215(3), 403–410.
  4. Altshuler, D., Donnelly, P., & Consortium, I. H. (2005). A haplotype map of the human genome. Nature, 437(7063), nature04226.
  5. Apweiler, R., Attwood, T. K., Bairoch, A., Bateman, A., Birney, E., Biswas, M., Bucher, P., Cerutti, L., Corpet, F., Croning, M. D., Durbin, R., Falquet, L., Fleischmann, W., Gouzy, J., Hermjakob, H., Hulo, N., Jonassen, I., Kahn, D., Kanapin, A., … Zdobnov, E. M. (2001). The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Research, 29(1), 37–40.
  6. Barker, R. (2012). The origin and spread of early‐ripening champa rice: It's impact on song dynasty China. Rice, 4(3–4), 184–186. 10.1007/s12284-011-9079-6
  7. Behler, J. (1993). Species survival plan for chinese alligator. Crocodile Specialist Group Newsletter, 12(4), 18.
  8. Benson, G. (1999). Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Research, 27(2), 573–580.
  9. Birney, E., Clamp, M., & Durbin, R. (2004). GeneWise and genomewise. Genome Research, 14(5), 988–995.
  10. Boitard, S., Rodriguez, W., Jay, F., Mona, S., & Austerlitz, F. (2016). Inferring population size history from large samples of genome‐wide molecular data – An approximate bayesian computation approach. PLoS Genetics, 12(3), e1005877. 10.1371/journal.pgen.1005877
  11. Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics, 30(15), 2114–2120.
  12. Browning, B. L., & Browning, S. R. (2013). Improving the accuracy and efficiency of identity‐by‐descent detection in population data. Genetics, 194(2), 459–471.
  13. Browning, B. L., Zhou, Y., & Browning, S. R. (2018). A one‐penny imputed genome from next‐generation reference panels. The American Journal of Human Genetics, 103(3), 338–348. 10.1016/j.ajhg.2018.07.015
  14. Burge, C., & Karlin, S. (1997). Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology, 268(1), 78–94.
  15. Campbell, M. S., Holt, C., Moore, B., & Yandell, M. (2014). Genome annotation and curation using MAKER and MAKER‐P. Current Protocols in Bioinformatics, 48, 4.11.1–4.11.39. 10.1002/0471250953.bi0411s48
  16. Ceballos, F. C., Joshi, P. K., Clark, D. W., Ramsay, M., & Wilson, J. F. (2018). Runs of homozygosity: windows into population history and trait architecture. Nature Reviews Genetics, 19(4), 220–234. 10.1038/nrg.2017.109
  17. Chang, C. C., Chow, C. C., Tellier, L. C., Vattikuti, S., Purcell, S. M., & Lee, J. J. (2015). Second‐generation PLINK: rising to the challenge of larger and richer datasets. Gigascience, 4, 7. 10.1186/s13742-015-0047-8
  18. Chen, F. Z., You, L. J., Yang, F., Wang, L. N., Guo, X. Q., Gao, F., Hua, C., Tan, C., Fang, L., Shan, R. Q., Zeng, W. J., Wang, B., Wang, R., Xu, X., & Wei, X. F. (2020). CNGBdb: China National GeneBank DataBase. Yi chuan yu yu zhong, 42(8), 799–809. 10.16288/j.yczz.20-080
  19. Chin, C.‐S. , Alexander, D. H. , Marks, P. , Klammer, A. A. , Drake, J. , Heiner, C. , Clum, A. , Copeland, A. , Huddleston, J. , Eichler, E. E. , Turner, S. W. , & Korlach, J. (2013). Nonhybrid, finished microbial genome assemblies from long‐read SMRT sequencing data. Nature Methods, 10(6), 563–569. 10.1038/nmeth.2474 [DOI] [PubMed] [Google Scholar]
  20. Chin, C.‐S. , Peluso, P. , Sedlazeck, F. J. , Nattestad, M. , Concepcion, G. T. , Clum, A. , Dunn, C. , O'Malley, R. , Figueroa‐Balderas, R. , Morales‐Cruz, A. , Cramer, G. R. , Delledonne, M. , Luo, C. , Ecker, J. R. , Cantu, D. , Rank, D. R. , & Schatz, M. C. (2016). Phased diploid genome assembly with single‐molecule real‐time sequencing. Nature Methods, 13(12), 1050–1054. 10.1038/nmeth.4035 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Cho, Y. S. , Hu, L. , Hou, H. , Lee, H. , Xu, J. , Kwon, S. , Oh, S. , Kim, H. M. , Jho, S. , Kim, S. , Shin, Y. A. , Kim, B. C. , Kim, H. , Kim, C. U. , Luo, S. J. , Johnson, W. E. , Koepfli, K. P. , Schmidt‐Küntzel, A. , Turner, J. A. , … Bhak, J. (2013). The tiger genome and comparative analysis with lion and snow leopard genomes. Nature Communications, 4, 2433. 10.1038/ncomms3433 [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Cingolani, P. , Platts, A. , Wang, L. L. , Coon, M. , Nguyen, T. , Wang, L. , … Ruden, D. M. (2012). A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly, 6(2), 80–92. 10.4161/fly.19695 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Danecek, P. , Auton, A. , Abecasis, G. , Albers, C. A. , Banks, E. , DePristo, M. A. , Handsaker, R. E. , Lunter, G. , Marth, G. T. , Sherry, S. T. , McVean, G. , Durbin, R. , & 1000 Genomes Project Analysis Group . (2011). The variant call format and VCFtools. Bioinformatics, 27(15), 2156–2158. 10.1093/bioinformatics/btr330 [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. DePristo, M. A. , Banks, E. , Poplin, R. , Garimella, K. V. , & Daly, M. J. (2011). A framework for variation discovery and genotyping using next‐generation DNA sequencing data. Nature Genetics, 43(5), 491–498. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Ding, Y. , & Wang, X. (2004). Factors influencing the population status of wild Chinese alligators Alligator sinensis. Biodiversity Science, 12(3), 324–332. [Google Scholar]
  26. Ding, Y. Z. , Wang, X. M. , He, L. J. , Xie, W. S. , Thorbjarnarson, J. B. , & McMurry, S. T. (2001). Study on the current population and habitat of the wild Chinese alligator (Alligator sinensis). Biodiversity Science, 9(2), 102–108. [Google Scholar]
  27. Durand, N. C. , Shamim, M. S. , Machol, I. , Rao, S. S. , Huntley, M. H. , Lander, E. S. , & Aiden, E. L. (2016). Juicer provides a one‐click system for analyzing loop‐resolution Hi‐C experiments. Cell Systems, 3(1), 95–98. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Feng, S. , Fang, Q. , Barnett, R. , Li, C. , Han, S. , Kuhlwilm, M. , … Zhang, G. (2019). The Genomic Footprints of the Fall and Recovery of the Crested Ibis. Current Biology, 29, 340–349. 10.1016/j.cub.2018.12.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Flynn, J. M. , Hubley, R. , Goubert, C. , Rosen, J. , Clark, A. G. , Feschotte, C. , & Smit, A. F. (2020). RepeatModeler2 for automated genomic discovery of transposable element families. Proceedings of the National Academy of Sciences, 117(17), 9451–9457. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Foote, A. D. , Martin, M. D. , Louis, M. , Pacheco, G. , Robertson, K. M. , Sinding, M. H. S. , Amaral, A. R. , Baird, R. W. , Baker, C. S. , Ballance, L. , Barlow, J. , Brownlow, A. , Collins, T. , Constantine, R. , Dabin, W. , Dalla Rosa, L. , Davison, N. J. , Durban, J. W. , Esteban, R. , … Morin, P. A. (2019). Killer whale genomes reveal a complex history of recurrent admixture and vicariance. Molecular Ecology, 28(14), 3427–3444. [DOI] [PubMed] [Google Scholar]
  31. Frankham, R. (2005). Genetics and extinction. Biological Conservation, 126, 131–140. [Google Scholar]
  32. Gemmell, N. J. , Rutherford, K. , Prost, S. , Tollis, M. , Winter, D. , Macey, J. R. , Adelson, D. L. , Suh, A. , Bertozzi, T. , Grau, J. H. , Organ, C. , Gardner, P. P. , Muffato, M. , Patricio, M. , Billis, K. , Martin, F. J. , Flicek, P. , Petersen, B. , Kang, L. , … Ngatiwai Trust Board . (2020). The tuatara genome reveals ancient features of amniote evolution. Nature, 584(7821), 403–409. 10.1038/s41586-020-2561-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Grantham, R. (1974). Amino acid difference formula to help explain protein evolution. Science, 185, 862–864. [DOI] [PubMed] [Google Scholar]
  34. Green, R. E. , Braun, E. L. , Armstrong, J. , Earl, D. , Nguyen, N. , Hickey, G. , Vandewege, M. W. , St John, J. A. , Capella‐Gutiérrez, S. , Castoe, T. A. , Kern, C. , Fujita, M. K. , Opazo, J. C. , Jurka, J. , Kojima, K. K. , Caballero, J. , Hubley, R. M. , Smit, A. F. , Platt, R. N. , … Ray, D. A. (2014). Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs. Science, 346(6215), 1254449. 10.1126/science.1254449 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Guang, X. , Lan, T. , Wan, Q.‐H. , Huang, Y. , Li, H. , Zhang, M. , … Zhang, L. (2021). Chromosome‐scale genomes provide new insights into subspecies divergence and evolutionary characteristics of the giant panda. Science Bulletin, 66, 2002–2013. [DOI] [PubMed] [Google Scholar]
  36. Guo, X. , Chen, F. , Gao, F. , Li, L. , Liu, K. , You, L. , Hua, C. , Yang, F. , Liu, W. , Peng, C. , Wang, L. , Yang, X. , Zhou, F. , Tong, J. , Cai, J. , Li, Z. , Wan, B. , Zhang, L. , Yang, T. , … Xu, X. (2020). CNSA: a data repository for archiving omics data. Database: The Journal of Biological Databases and Curation, 2020, 1–6. 10.1093/database/baaa055 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Haas, B. J. , Papanicolaou, A. , Yassour, M. , Grabherr, M. , Blood, P. D. , Bowden, J. , Couger, M. B. , Eccles, D. , Li, B. , Lieber, M. , MacManes, M. , Ott, M. , Orvis, J. , Pochet, N. , Strozzi, F. , Weeks, N. , Westerman, R. , William, T. , Dewey, C. N. , … Regev, A. (2013). De novo transcript sequence reconstruction from RNA‐seq using the Trinity platform for reference generation and analysis. Nature Protocols, 8(8), 1494–1512. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Haas, B. J. , Salzberg, S. L. , Zhu, W. , Pertea, M. , Allen, J. E. , Orvis, J. , … Wortman, J. R. (2008). Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biology, 9(1), 1–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Haddad, N. M. , Brudvig, L. A. , Clobert, J. , Davies, K. F. , Gonzalez, A. , Holt, R. D. , … Collins, C. D. (2015). Habitat fragmentation and its lasting impact on Earth's ecosystems. Science Advances, 1(2), e1500052. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Hedrick, P. W. , & Garcia‐Dorado, A. (2016). Understanding inbreeding depression, purging, and genetic rescue. Trends in Ecology & Evolution, 31, 940–952. 10.1016/j.tree.2016.09.005 [DOI] [PubMed] [Google Scholar]
  41. Hoban, S. , Bruford, M. , D'Urban Jackson, J. , Lopes‐Fernandes, M. , Heuertz, M. , Hohenlohe, P. A. , … Laikre, L. (2020). Genetic diversity targets and indicators in the CBD post‐2020 Global Biodiversity Framework must be improved. Biological Conservation, 248, 108654. 10.1016/j.biocon.2020.108654 [DOI] [Google Scholar]
  42. Honegger, R. E. , & Hunt, R. H. (1990). Breeding crocodiles in zoological gardens outside the species range, with some data on the general situations in European zoos, 1989. In Crocodiles. Proc. 10th Working Meeting of the IUCN/SSC Crocodile Specialist Group, Gainesville, Florida (Vol. 1, pp. 200–228). IUCN. [Google Scholar]
  43. Jiang, H. , & Wu, X. B. (2018). Alligator sinensis. The IUCN Red List of Threatened Species, e.T867A3146005. 10.2305/IUCN.UK.2018-1.RLTS.T867A3146005.en [DOI]
  44. Jiang, T. , & Zhang, Q. (2004). Climatic changes driving on floods in the Yangtze Delta, China during 1000–2002. Cybergeo, 296. [Google Scholar]
  45. Jurka, J. , Kapitonov, V. V. , Pavlicek, A. , Klonowski, P. , Kohany, O. , & Walichiewicz, J. (2005). Repbase Update, a database of eukaryotic repetitive elements. Cytogenetic and Genome Research, 110(1–4), 462–467. [DOI] [PubMed] [Google Scholar]
  46. Kanehisa, M. , & Goto, S. (2000). KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, 28(1), 27–30. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Kardos, M. , Åkesson, M. , Fountain, T. , Flagstad, O. , Liberg, O. , Olason, P. , Sand, H. , Wabakken, P. , Wikenros, C. , & Ellegren, H. (2018). Genomic consequences of intensive inbreeding in an isolated wolf population. Nature Ecology & Evolution, 2(1), 124–131. 10.1038/s41559-017-0375-4 [DOI] [PubMed] [Google Scholar]
  48. Keller, L. F. , & Waller, D. M. (2002). Inbreeding effects in wild populations. Trends in Ecology & Evolution, 17(5), 230–241. [Google Scholar]
  49. Keller, O. , Kollmar, M. , Stanke, M. , & Waack, S. (2011). A novel hybrid gene prediction method employing protein multiple sequence alignments. Bioinformatics, 27(6), 757–763. [DOI] [PubMed] [Google Scholar]
  50. Khan, A. , Patel, K. , Shukla, H. , Viswanathan, A. , van der Valk, T. , Borthakur, U. , … Kardos, M. (2021). Genomic evidence for inbreeding depression and purging of deleterious genetic variation in Indian tigers. Proceedings of the National Academy of Sciences, 118(49), e2023018118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Kielbasa, S. M. , Wan, R. , Sato, K. , Horton, P. , & Frith, M. C. (2011). Adaptive seeds tame genomic sequence comparison. Genome Research, 21, 487–493. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Korf, I. (2004). Gene finding in novel genomes. BMC Bioinformatics, 5(1), 1–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Krzywinski, M. , Schein, J. , Birol, I. , Connors, J. , Gascoyne, R. , Horsman, D. , Jones, S. J. , & Marra, M. A. (2009). Circos: An information aesthetic for comparative genomics. Genome Research, 19, 1639–1645. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Kyriazis, C. C. , Wayne, R. K. , & Lohmueller, K. E. (2020). Strongly deleterious mutations are a primary determinant of extinction risk due to inbreeding depression. Evolution Letters, 5, 33–47. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Lambeck, K. , Rouby, H. , Purcell, A. , Sun, Y. , & Sambridge, M. (2014). Sea level and global ice volumes from the Last Glacial Maximum to the Holocene. Proceedings of the National Academy of Sciences, 111(43), 15296–15303. 10.1073/pnas.1411762111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Nguyen, L.‐T. , Schmidt, H. A. , von Haeseler, A. , & Minh, B. Q. (2015). IQ‐TREE: A fast and effective stochastic algorithm for estimating maximum‐likelihood phylogenies. Molecular Biology and Evolution, 32(1), 268–274. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Lander, E. S. , & Waterman, M. S. (1988). Genomic mapping by fingerprinting random clones: a mathematical analysis. Genomics, 2(3), 231–239. 10.1016/0888-7543(88)90007-9 [DOI] [PubMed] [Google Scholar]
  58. Li, H. , & Durbin, R. (2010). Fast and accurate long‐read alignment with Burrows–Wheeler transform. Bioinformatics, 26(5), 589–595. 10.1093/bioinformatics/btp698 [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Li, H. , Handsaker, B. , Wysoker, A. , Fennell, T. , Ruan, J. , Homer, N. , Marth, G. , Abecasis, G. , Durbin, R. , & 1000 Genome Project Data Processing Subgroup . (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078–2079. 10.1093/bioinformatics/btp352 [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Li, J. , Dodson, J. , Yan, H. , Wang, W. , Innes, J. B. , Zong, Y. , Zhang, X. , Xu, Q. , Ni, J. , & Lu, F. (2018). Quantitative Holocene climatic reconstructions for the lower Yangtze region of China. Climate Dynamics, 50(3–4), 1101–1113. 10.1007/s00382-017-3664-3 [DOI] [Google Scholar]
  61. Li, L. , Stoeckert, C. J., Jr. , & Roos, D. S. (2003). OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Research, 13(9), 2178–2189. 10.1101/gr.1224503 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Li, W. H. , Wu, C. I. , & Luo, C. C. (1984). Nonrandomness of point mutation as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications. Journal of Molecular Evolution, 21(1), 58–71. [DOI] [PubMed] [Google Scholar]
  63. Li, Z. , Chen, Y. , Mu, D. , Yuan, J. , Shi, Y. , Zhang, H. , … Fan, W. (2011). Comparison of the two major classes of assembly algorithms: Overlap–layout–consensus and de‐bruijn‐graph. Briefings in Functional Genomics, 11(1), 25–37. 10.1093/bfgp/elr035 [DOI] [PubMed] [Google Scholar]
  64. Lieberman‐Aiden, E. , van Berkum, N. , Williams, L. , Imakaev, M. , Ragoczy, T. , Telling, A. , Amit, I. , Lajoie, B. R. , Sabo, P. J. , Dorschner, M. O. , Sandstrom, R. , Bernstein, B. , Bender, M. A. , Groudine, M. , Gnirke, A. , Stamatoyannopoulos, J. , Mirny, L. A. , Lander, E. S. , & Dekker, J. (2009). Comprehensive mapping of long‐range interactions reveals folding principles of the human genome. Science, 326(5950), 289–293. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Luo, R. , Liu, B. , Xie, Y. , Li, Z. , Huang, W. , Yuan, J. , He, G. , Chen, Y. , Pan, Q. , Liu, Y. , Tang, J. , Wu, G. , Zhang, H. , Shi, Y. , Liu, Y. , Yu, C. , Wang, B. , Lu, Y. , Han, C. , … Wang, J. (2012). SOAPdenovo2: an empirically improved memory‐efficient short‐read de novo assembler. Gigascience, 1(1), 2047‐217X‐1‐18. 10.1186/2047-217x-1-18 [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Majoros, W. H. , Pertea, M. , & Salzberg, S. L. (2004). TigrScan and GlimmerHMM: Two open source ab initio eukaryotic gene‐finders. Bioinformatics, 20(16), 2878–2879. [DOI] [PubMed] [Google Scholar]
  67. McQuillan, R. , Leutenegger, A. L. , Abdel‐Rahman, R. , Franklin, C. S. , Pericic, M. , Barac‐Lauc, L. , Smolej‐Narancic, N. , Janicijevic, B. , Polasek, O. , Tenesa, A. , Macleod, A. K. , Farrington, S. M. , Rudan, P. , Hayward, C. , Vitart, V. , Rudan, I. , Wild, S. H. , Dunlop, M. G. , Wright, A. F. , … Wilson, J. F. (2008). Runs of homozygosity in European populations. The American Journal of Human Genetics, 83, 359–372. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Miles, L. G. , Isberg, S. R. , Glenn, T. C. , Lance, S. L. , Dalzell, P. , Thomson, P. C. , & Moran, C. (2009). A genetic linkage map for the saltwater crocodile (Crocodylus porosus). BMC Genomics, 10, 339. 10.1186/1471-2164-10-339 [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Ni, X. W. (2012). Pedigree construction of Zhejiang ex‐situ conservation population and screening founder population for reintroduction in Chinese alligator. [Google Scholar]
  70. Nie, C. , Zhao, J. , Li, Y. , & Wu, X. (2013). Diversity and selection of MHC class IIb gene exon3 in Chinese alligator. Molecular Biology Reports, 40(1), 295–301. 10.1007/s11033-012-2061-6 [DOI] [PubMed] [Google Scholar]
  71. Patterson, N. , Moorjani, P. , Luo, Y. , Mallick, S. , Rohland, N. , Zhan, Y. , Genschoreck, T. , Webster, T. , & Reich, D. (2012). Ancient admixture in human history. Genetics, 192(3), 1065–1093. 10.1534/genetics.112.145037 [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Pečnerová, P. , Garcia‐Erill, G. , Liu, X. , Nursyifa, C. , Waples, R. K. , Santander, C. G. , Quinn, L. , Frandsen, P. , Meisner, J. , Stæger, F. F. , Rasmussen, M. S. , Brüniche‐Olsen, A. , Hviid Friis Jørgensen, C. , da Fonseca, R. R. , Siegismund, H. R. , Albrechtsen, A. , Heller, R. , Moltke, I. , & Hanghøj, K. (2021). High genetic diversity and low differentiation reflect the ecological versatility of the African leopard. Current Biology, 31(9), 1862‐1871. e1865. [DOI] [PubMed] [Google Scholar]
  73. Pickrell, J. K. , & Pritchard, J. K. (2012). Inference of population splits and mixtures from genome‐wide allele frequency data. PLoS Genetics, 8(11), e1002967. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. R Development Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing. [Google Scholar]
  75. Robinson, J. A. , Ortega‐Del Vecchyo, D. , Fan, Z. , Kim, B. Y. , von Holdt, B. M. , Marsden, C. D. , … Wayne, R. K. (2016). Genomic flatlining in the endangered island fox. Current Biology, 26, 1183–1189. 10.1016/j.cub.2016.02.062 [DOI] [PubMed] [Google Scholar]
  76. Robinson, J. A. , Räikkönen, J. , Vucetich, L. M. , Vucetich, J. A. , Peterson, R. O. , Lohmueller, K. E. , & Wayne, R. K. (2019). Genomic signatures of extensive inbreeding in Isle Royale wolves, a population on the threshold of extinction. Science Advances, 5(5), eaau0757. 10.1126/sciadv.aau0757 [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Ross, J. M. , Verdade, L. M. , Ross, J. P. , & Ross, J. (1998). Crocodiles. Status survey and conservation action plan (2nd ed.). IUCN/SSC Crocodile Specialist Group. [Google Scholar]
  78. Sambrook, J. , Fritsch, E. F. , & Maniatis, T. (1989). Molecular cloning: A laboratory manual. Cold Spring Harbor Laboratory Press. [Google Scholar]
  79. Saremi, N. F. , Supple, M. A. , Byrne, A. , Cahill, J. A. , Coutinho, L. L. , Dalén, L. , Figueiró, H. V. , Johnson, W. E. , Milne, H. J. , O'Brien, S. J. , O'Connell, B. , Onorato, D. P. , Riley, S. P. D. , Sikich, J. A. , Stahler, D. R. , Villela, P. M. S. , Vollmers, C. , Wayne, R. K. , Eizirik, E. , … Shapiro, B. (2019). Puma genomes from North and South America provide insights into the genomic consequences of inbreeding. Nature Communications, 10, 4769. 10.1038/s41467-019-12741-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Schiffels, S. , & Durbin, R. (2014). Inferring human population size and separation history from multiple genome sequences. Nature Genetics, 46(8), 919–927. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Stapley, J. , Feulner, P. G. D. , Johnston, S. E. , Santure, A. W. , & Smadja, C. M. (2017). Variation in recombination frequency and distribution across eukaryotes: patterns and processes. Philosophical Transactions of the Royal Society B Biological Sciences, 372(1736), 20160455. 10.1098/rstb.2016.0455 [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Supple, M. A. , & Shapiro, B. (2018). Conservation of biodiversity in the genomics era. Genome Biology, 19(1), 131. 10.1186/s13059-018-1520-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Tarailo‐Graovac, M. , & Chen, N. (2009). Using RepeatMasker to identify repetitive elements in genomic sequences. Current Protocols in Bioinformatics, 25(1), 4.10.1–4.10.14. [DOI] [PubMed] [Google Scholar]
  84. Thompson, E. A. (2013). Identity by descent: variation in meiosis, across genomes, and in populations. Genetics, 194, 301–326. 10.1534/genetics.112.148825 [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Thorbjarnarson, J. , Wang, X. M. , Ming, S. , He, L. J. , Ding, Y. Z. , Wu, Y. L. , & McMurry, S. T. (2002). Wild populations of the Chinese alligator approach extinction. Biological Conservation, 103, 93–102. [Google Scholar]
  86. Voight, B. F. , Kudaravalli, S. , Wen, X. , & Pritchard, J. K. (2006). A map of recent positive selection in the human genome. PLoS Biology, 4(3), e72. 10.1371/journal.pbio.0040072 [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. von Seth, J. , Dussex, N. , Díez‐del‐Molino, D. , van der Valk, T. , Kutschera, V. E. , Kierczak, M. , Steiner, C. C. , Liu, S. , Gilbert, M. T. P. , Sinding, M. S. , Prost, S. , Guschanski, K. , Nathan, S. K. S. S. , Brace, S. , Chan, Y. L. , Wheat, C. W. , Skoglund, P. , Ryder, O. A. , Goossens, B. , … Dalén, L. (2021). Genomic insights into the conservation status of the world's last remaining Sumatran rhinoceros populations. Nature Communications, 12, 2393. 10.1038/s41467-021-22386-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Walker, B. J. , Abeel, T. , Shea, T. , Priest, M. , Abouelliel, A. , Sakthikumar, S. , Cuomo, C. A. , Zeng, Q. , Wortman, J. , Young, S. K. , & Earl, A. M. (2014). Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One, 9(11), e112963. 10.1371/journal.pone.0112963 [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Wan, Q. H. , Pan, S. K. , Hu, L. , Zhu, Y. , Xu, P. W. , Xia, J. Q. , Chen, H. , He, G. Y. , He, J. , Ni, X. W. , Hou, H. L. , Liao, S. G. , Yang, H. Q. , Chen, Y. , Gao, S. K. , Ge, Y. F. , Cao, C. C. , Li, P. F. , Fang, L. M. , … Fang, S. G. (2013). Genome analysis and signature discovery for diving and sensory properties of the endangered Chinese alligator. Cell Research, 23(9), 1091–1105. 10.1038/cr.2013.104 [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Wang, K. , Li, M. , & Hakonarson, H. (2010). ANNOVAR: functional annotation of genetic variants from high‐throughput sequencing data. Nucleic Acids Research, 38(16), e164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Wang, P. , Burley, J. T. , Liu, Y. , Chang, J. , Chen, L. Q. , … Zhang, Z. (2021). Genomic consequences of long‐term population decline in brown eared pheasant. Molecular Biology and Evolution, 38(1), 263–273. 10.1093/molbev/msaa213 [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Wang, S. , Zhang, J. , Jiao, W. , Li, J. , Xun, X. , Sun, Y. , Guo, X. , Huan, P. , Dong, B. , Zhang, L. , Hu, X. , Sun, X. , Wang, J. , Zhao, C. , Wang, Y. , Wang, D. , Huang, X. , Wang, R. , Lv, J. , … Bao, Z. (2017). Scallop genome provides insights into evolution of bilaterian karyotype and development. Nature Ecology & Evolution, 1(5), 120. 10.1038/s41559-017-0120 [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Weir, B. S. , & Cockerham, C. C. (1984). Estimating F‐Statistics for the Analysis of Population Structure. Evolution, 38(6), 1358–1370. 10.2307/2408641 [DOI] [PubMed] [Google Scholar]
  94. Wen, R. S. (2000). The rise and fall of Alligator sinensis and vicissitudes of environment. Nature Magazine, 1, 55–58. [Google Scholar]
  95. Westbury, M. V. , Petersen, B. , Garde, E. , Heide‐Jørgensen, M. P. , & Lorenzen, E. D. (2019). Narwhal genome reveals long‐term low genetic diversity despite current large abundance size. iScience, 15, 592–599. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Willi, Y. , Buskirk, J. V. , & Hoffmann, A. A. (2006). Limits to the adaptive potential of small populations. Annual Review of Ecology, Evolution, and Systematics, 37(1), 433–458. 10.1146/annurev.ecolsys.37.091305.110145 [DOI] [Google Scholar]
  97. Wu, L. S. , Wu, X. B. , Jiang, H. X. , & Wang, C. L. (2006). The analysis on the reproductive ability of Chinese alligator (Alligator sinensis) in captive population in Anhui Province and the anticipation of population increase. Acta Hydrobiologica Sinica, 30(2), 160–165. [Google Scholar]
  98. Wu, T. , Hu, E. , Xu, S. , Chen, M. , Guo, P. , Dai, Z. , Feng, T. , Zhou, L. , Tang, W. , Zhan, L. , Fu, X. , Liu, S. , Bo, X. , & Yu, G. (2021). clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. The Innovation (New York), 2(3), 100141. 10.1016/j.xinn.2021.100141 [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Wu, X. B. , Wang, Y. Q. , Zhou, K. Y. , Nie, J. S. , Wang, C. L. , & Xie, W. S. (1999). Analysis on reproduction of captive population of Alligator sinensis in Xuanzhou, Anhui. Chinese Journal of Applied & Environmental Biology, 5(6), 585–588. [Google Scholar]
  100. Wu, X.‐B. , Wang, Y.‐Q. , Zhou, K.‐Y. , Zhu, W.‐Q. , Nie, J.‐S. , Wang, C.‐L. , & Xie, W.‐S. (2002). Genetic variation in captive population of Chinese alligator, Alligator sinensis, revealed by random amplified polymorphic DNA (RAPD). Biological Conservation, 106, 435–441. [Google Scholar]
  101. Xie, H.‐X. , Liang, X.‐X. , Chen, Z.‐Q. , Li, W.‐M. , Mi, C.‐R. , Li, M. , Wu, Z. J. , Zhou, X. M. , & Du, W. G. (2022). Ancient demographics determine the effectiveness of genetic purging in endangered lizards. Molecular Biology and Evolution, 39(1), msab359. 10.1093/molbev/msab359 [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Xue, Y. , Prado‐Martinez, J. , Sudmant, P. H. , Narasimhan, V. , Ayub, Q. , Szpak, M. , Frandsen, P. , Chen, Y. , Yngvadottir, B. , Cooper, D. N. , de Manuel, M. , Hernandez‐Rodriguez, J. , Lobon, I. , Siegismund, H. R. , Pagani, L. , Quail, M. A. , Hvilsom, C. , Mudakikwa, A. , Eichler, E. E. , … Scally, A. (2015). Mountain gorilla genomes reveal the impact of long‐term population decline and inbreeding. Science, 348(6231), 242–245. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Yang, Z. (2007). PAML 4: Phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution, 24(8), 1586–1591. [DOI] [PubMed] [Google Scholar]
  104. Yi, X. , Liang, Y. , Huerta‐Sanchez, E. , Jin, X. , Cuo, Z. X. , Pool, J. E. , Xu, X. , Jiang, H. , Vinckenbosch, N. , Korneliussen, T. S. , Zheng, H. , Liu, T. , He, W. , Li, K. , Luo, R. , Nie, X. , Wu, H. , Zhao, M. , Cao, H. , … Wang, J. (2010). Sequencing of 50 human exomes reveals adaptation to high altitude. Science, 329, 75–78. 10.1126/science.1190371 [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Yin, W. , Wang, Z.‐J. , Li, Q.‐Y. , Lian, J.‐M. , Zhou, Y. , Lu, B.‐Z. , Jin, L. J. , Qiu, P. X. , Zhang, P. , Zhu, W. B. , Wen, B. , Huang, Y. J. , Lin, Z. L. , Qiu, B. T. , Su, X. W. , Yang, H. M. , Zhang, G. J. , Yan, G. M. , & Zhou, Q. (2016). Evolutionary trajectories of snake genes and genomes revealed by comparative analyses of five‐pacer viper. Nature Communications, 7(1), 13107. 10.1038/ncomms13107 [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Yu, G. , Wang, L. G. , Han, Y. , & He, Q. Y. (2012). clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology, 16(5), 284–287. 10.1089/omi.2011.0118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Zeng, C. , Ye, Q. , & Fang, S. (2011). Establishment and cryopreservation of liver, heart and muscle cell lines derived from the Chinese alligator (Alligator sinensis). Chinese Science Bulletin, 56(24), 2576–2579. 10.1007/s11434-011-4622-9 [DOI] [Google Scholar]
  108. Zhai, T. , Yang, H. Q. , Zhang, R. C. , Fang, L. M. , Zhong, G. H. , & Fang, S. G. (2017). Effects of Population Bottleneck and Balancing Selection on the Chinese Alligator Are Revealed by Locus‐Specific Characterization of MHC Genes. Scientific Reports, 7, 5549. 10.1038/s41598-017-05640-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Zhang, C. , Dong, S.‐S. , Xu, J.‐Y. , He, W.‐M. , & Yang, T.‐L. (2018). PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format files. Bioinformatics, 35(10), 1786–1788. 10.1093/bioinformatics/bty875 [DOI] [PubMed] [Google Scholar]
  110. Zhang, H. , Cheng, H. , Cai, Y. , Spötl, C. , Kathayat, G. , Sinha, A. , Edwards, L. , & Tan, L. (2018). Hydroclimatic variations in southeastern China during the 4.2 ka event reflected by stalagmite records. Climate of the Past Discussions, 14, 1805–1817. [Google Scholar]

Associated Data


Supplementary Materials

Appendix S1

Data Availability Statement

The data that support the findings of this study have been deposited in the CNGB Sequence Archive (CNSA) (Guo et al., 2020) of the China National GeneBank DataBase (CNGBdb) (Chen et al., 2020) under accession number CNP0002575. The mRNA‐Seq and sRNA‐Seq data of the Chinese alligator, generated in our previous work and used here for genome annotation, are deposited in the NCBI SRA database under BioProject accession numbers PRJNA556093 and PRJNA556092, respectively.


Articles from Molecular Ecology Resources are provided here courtesy of Wiley