Wellcome Open Res. 2021 May 14;6:112. [Version 1] doi: 10.12688/wellcomeopenres.16631.1

The genome sequence of the European golden eagle, Aquila chrysaetos chrysaetos Linnaeus 1758

Dan Mead 1,2, Rob Ogden 3, Anna Meredith 3,4, Gabriela Peniche 3, Michelle Smith 1, Craig Corton 1, Karen Oliver 1, Jason Skelton 1, Emma Betteridge 1, Jale Doulcan 1,5, Nadine Holmes 6, Victoria Wright 6, Matt Loose 6, Michael A Quail 1, Shane A McCarthy 1,7, Kerstin Howe 1, William Chow 1, James Torrance 1, Joanna Collins 1, Richard Challis 1, Richard Durbin 1,7, Mark Blaxter 1,a
PMCID: PMC8499043  PMID: 34671705

Abstract

We present a genome assembly from an individual female Aquila chrysaetos chrysaetos (the European golden eagle; Chordata; Aves; Accipitridae). The genome sequence is 1.23 gigabases in span. The majority of the assembly is scaffolded into 28 chromosomal pseudomolecules, including the W and Z sex chromosomes.

Keywords: Aquila chrysaetos, European golden eagle, genome sequence, chromosomal

Species taxonomy

Eukaryota; Metazoa; Chordata; Vertebrata; Aves; Accipitriformes; Accipitridae; Accipitrinae; Aquila; Aquila chrysaetos subspecies chrysaetos Linnaeus 1758 (NCBI:txid223781).

Introduction

The golden eagle, Aquila chrysaetos, is an apex predator with a range that spans the Holarctic. It has been divided into six subspecies, with the nominate European subspecies, A. chrysaetos chrysaetos, found across Europe, except for the Iberian peninsula, and extending eastwards in Russia as far as western Siberia. However, mitochondrial sequence and microsatellite analyses suggest that only two major clades exist within the species, a globally distributed northern clade and a distinct Mediterranean clade (Nebel et al., 2015; Nebel et al., 2019; Sato et al., 2017). Formerly widespread, A. chrysaetos chrysaetos is now confined to wilderness areas. Once found throughout Britain and Ireland, the golden eagle was extirpated from England and Wales by 1850 and from Ireland by 1912. The golden eagle was particularly badly impacted by bioaccumulating pesticides in the late 20th century (Watson et al., 2010). A single pair nested in the English Lake District from 1969–2004, but this has not led to a sustained recolonisation. There is ongoing monitoring of the remaining population in Scotland, where deliberate persecution is thought to be a major threat (Fielding et al., 2006).

Genome sequence report

The genome was sequenced from a single female A. chrysaetos chrysaetos collected by Gabriela Peniche under UK Home Office project licence PB8A1D5C7. A total of 46-fold coverage in Pacific Biosciences single-molecule long reads (N50 19 kb) and 47-fold coverage in 10X Genomics read clouds (from molecules with an estimated N50 of 68 kb) were generated. Primary assembly contigs were scaffolded with chromosome conformation HiC data. The HiC scaffolds were validated using BioNano long-range restriction maps (140-fold effective coverage in molecules of N50 310 kb). The final assembly has a total length of 1.23 Gb in 145 sequence scaffolds with a scaffold N50 of 46.9 Mb (Table 1). The majority, 99.0%, of the assembly sequence was assigned to 28 chromosomal-level scaffolds representing 26 autosomes (numbered by sequence length), and the W and Z sex chromosomes (Figure 1–Figure 4; Table 2). The assembly has a BUSCO (Simão et al., 2015) completeness of 97.4% using the aves_odb10 reference set. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The A. chrysaetos chrysaetos assembly has equivalent span and scaffold-level contiguity to an assembly of A. chrysaetos canadiensis (NCBI:txid216574) produced by the Erez Lieberman Aiden lab as part of the DNAZoo project.

Table 1. Genome data for Aquila chrysaetos chrysaetos bAquChr1.4.

Project accession data
Assembly identifier bAquChr1.4
Species Aquila chrysaetos chrysaetos
Specimen GE037-17
NCBI taxonomy ID 223781
BioProject PRJEB27699
Biosample ID SAMEA994725
Isolate information Female, heart muscle tissue
Raw data accessions
Pacific Biosciences SEQUEL I ERR2980431, ERR2980432, ERR2980435, ERR2980436, ERR2980437, ERR2980438, ERR2980439, ERR2980440, ERR2980448, ERR2980449, ERR2980450, ERR2980451, ERR2990043, ERR2990044, ERR3013207, ERR3013208
10X Genomics Illumina ERR3316065, ERR3316066, ERR3316067, ERR3316068
Hi-C Illumina ERR3312497
BioNano ERZ1392826
Genome assembly
Assembly accession GCA_900496995.4
Accession of alternate haplotype GCA_902153765.2
Span (Mb) 1,233
Number of contigs 373
Contig N50 length (Mb) 22
Number of scaffolds 145
Scaffold N50 length (Mb) 47
Longest scaffold (Mb) 85
BUSCO * genome score C:97.4%[S:96.7%,D:0.7%],F:0.6%,M:2.1%,n:8338

* BUSCO scores based on the aves_odb10 BUSCO set using v5.0.0. C=complete [S=single copy, D=duplicated], F=fragmented, M=missing, n=number of orthologues in comparison. A full set of BUSCO scores is available at https://blobtoolkit.genomehubs.org/view/Aquila%20chrysaetos%20chrysaetos/dataset/UFQG04/busco.
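The BUSCO summary string in Table 1 follows a fixed layout, so it can be parsed and sanity-checked programmatically. The sketch below is illustrative only: the pattern and the function name `parse_busco` are assumptions made for this example and are not part of BUSCO itself. It checks that the single-copy and duplicated percentages add up to the complete percentage, allowing for rounding.

```python
import re

# Hypothetical pattern for the one-line BUSCO summary shown in Table 1.
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],\s*"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco(summary):
    """Return the percentages (floats) and orthologue count (int) from a BUSCO summary."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError(f"unrecognised BUSCO summary: {summary!r}")
    scores = {key: float(value) for key, value in match.groupdict().items() if key != "n"}
    scores["n"] = int(match.group("n"))
    return scores


# Summary string copied from Table 1.
scores = parse_busco("C:97.4%[S:96.7%,D:0.7%],F:0.6%,M:2.1%,n:8338")

# Complete = single copy + duplicated, within rounding of the reported values.
assert abs(scores["S"] + scores["D"] - scores["C"]) < 0.05
print(scores)
```

Because each percentage is rounded to one decimal place, the complete, fragmented and missing values need not sum to exactly 100%.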

Figure 1. Genome assembly of Aquila chrysaetos chrysaetos bAquChr1.4.

BlobToolKit Snailplot. The plot shows N50 metrics for bAquChr1.4 and BUSCO scores for the Aves set of orthologues. Interactive version available at https://blobtoolkit.genomehubs.org/view/Aquila%20chrysaetos%20chrysaetos/dataset/UFQG04/snail.

Figure 2. Genome assembly of Aquila chrysaetos chrysaetos bAquChr1.4.

BlobToolKit GC-coverage plot. Interactive version available at https://blobtoolkit.genomehubs.org/view/Aquila%20chrysaetos%20chrysaetos/dataset/UFQG04/blob?plotShape=circle.

Figure 3. Genome assembly of Aquila chrysaetos chrysaetos bAquChr1.4.

BlobToolKit Cumulative sequence plot. Interactive version available at https://blobtoolkit.genomehubs.org/view/Aquila%20chrysaetos%20chrysaetos/dataset/UFQG04/cumulative.

Figure 4. Genome assembly of Aquila chrysaetos chrysaetos bAquChr1.4.

Hi-C contact map. Hi-C contact map of the bAquChr1.4 assembly, visualized in HiGlass.

Table 2. Chromosomal pseudomolecules in the genome assembly of Aquila chrysaetos chrysaetos bAquChr1.4.

ENA accession Chromosome Size (Mb) GC%
LR606181.1 1 85.46 40.2
LR606182.1 2 83.00 41.1
LR606183.1 3 79.38 41.8
LR606184.1 4 77.27 40.4
LR606185.1 5 76.62 42.5
LR606186.1 6 54.40 42.2
LR606187.1 7 47.78 41.0
LR606188.1 8 46.94 43.6
LR606189.1 9 45.24 44.1
LR606190.1 10 43.95 43.5
LR606191.1 11 43.76 42.2
LR606192.1 12 43.48 43.3
LR606193.1 13 41.79 42.3
LR606194.1 14 34.34 39.2
LR606195.1 15 30.98 42.7
LR606196.1 16 30.61 43.9
LR606197.1 17 29.70 42.6
LR606198.1 18 28.56 40.4
LR606199.1 19 27.98 41.9
LR606200.1 20 25.31 43.1
LR606201.1 21 24.76 43.1
LR606202.1 22 22.51 44.3
LR606203.1 23 21.01 41.1
LR606204.1 24 20.99 48.3
LR606205.1 25 19.84 44.3
LR606206.1 26 17.72 40.7
HG999777.1 W 11.73 44.1
LR606180.1 Z 88.22 40.7
- unplaced 30.39 48.4

As the Hi-C data were sourced from a male bird, it was not possible to fully construct the W chromosome for the female sample bAquChr1. Scaffolds identified as belonging to W have therefore been submitted as unordered fragments. The largest of these fragments has been designated as the W chromosome and all other W scaffolds are labelled as W_unloc.
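As a cross-check, the scaffold N50 reported in Table 1 can be recomputed from the sizes listed in Table 2. The sketch below is illustrative only. The function name `n50` is hypothetical, the sizes are copied from Table 2 in Mb, and the 30.39 Mb of unplaced sequence is assumed to consist of scaffolds that are each shorter than the N50 scaffold, so that it adds to the total span without changing which scaffold crosses the halfway point.

```python
# Chromosome sizes in Mb, copied from Table 2 (autosomes 1-26, then W and Z).
CHROMOSOME_SIZES_MB = [
    85.46, 83.00, 79.38, 77.27, 76.62, 54.40, 47.78, 46.94, 45.24, 43.95,
    43.76, 43.48, 41.79, 34.34, 30.98, 30.61, 29.70, 28.56, 27.98, 25.31,
    24.76, 22.51, 21.01, 20.99, 19.84, 17.72, 11.73, 88.22,
]
# Unplaced sequence from Table 2; assumed here to be made of short scaffolds.
UNPLACED_MB = 30.39


def n50(lengths, extra_span=0.0):
    """Length of the scaffold at which the running total reaches half the span.

    extra_span is sequence that counts towards the total span but is assumed
    to sit in scaffolds shorter than the N50 scaffold.
    """
    total = sum(lengths) + extra_span
    running = 0.0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= total / 2:
            return length
    raise ValueError("listed lengths cover less than half of the total span")


total_span = sum(CHROMOSOME_SIZES_MB) + UNPLACED_MB
scaffold_n50 = n50(CHROMOSOME_SIZES_MB, extra_span=UNPLACED_MB)
print(f"span = {total_span:.2f} Mb, scaffold N50 = {scaffold_n50:.2f} Mb")
```

Under these assumptions the function returns 46.94 Mb, the size of chromosome 8, which agrees with the scaffold N50 of 46.9 Mb quoted in the text.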

Methods

The golden eagle specimen was collected, following death by natural causes, from an area 15 km from the Highland village of Fort Augustus, Scotland, under UK Home Office project licence no. PB8A1D5C7. The heart was dissected out during autopsy. The specimen is preserved frozen at the University of Edinburgh.

DNA was extracted from heart tissue following the BioNano protocol. Pacific Biosciences CLR long read and 10X Genomics read cloud sequencing libraries were constructed according to manufacturers’ instructions. Sequencing was performed by the Scientific Operations core at the Wellcome Sanger Institute on Pacific Biosciences SEQUEL I and Illumina HiSeq X instruments. Hi-C data were generated using the Dovetail HiC library preparation kit at WSI.

BioNano data were generated at DeepSeq, University of Nottingham. High Molecular Weight genomic DNA (HMW gDNA) was extracted from an agarose plug (bAquChr (Golden Eagle); Plug 2) that had been prepared and shipped according to the Bionano Agarose Plug Shipping Instructions. The Bionano Prep Animal Tissue DNA Isolation Soft Tissue Protocol (Document Number: 30077; Document Revision: C) was used to complete HMW gDNA extraction. DNA quantitation, using the Qubit Fluorometer and the Qubit dsDNA BR kit (ThermoFisher: Q32853), gave a mean concentration of 117 ng/µl (CV = 0.118). Labelling was performed with an input of 750 ng of HMW gDNA, using the DLS DNA Labeling Kit (Bionano: 80003) and the Bionano Prep Direct Label and Stain (DLS) Protocol (Document Number: 30206; Document Revision: D). The labelled sample was quantified by Qubit Fluorometer and the Qubit dsDNA HS Assay Kit (ThermoFisher: Q32854). The average concentration of the labelled sample was 4.37 ng/µl (CV = 0.042). The labelling reaction was run over one flowcell of a Bionano Saphyr Chip (Bionano: 20319) on the Bionano Saphyr (Bionano: 60239) running the following software versions: Bionano Access: 1.2.2; Bionano Tools: 7921; Bionano Solve: Solve3.2.2_08222018; RefAligner: 7782.7865rel; HybridScaffold/SVMerge/VariantAnnotation: 08222018. For analysis, the molecule file was used to generate a de novo assembly using the default Bionano Access settings. This assembly was used to generate a hybrid scaffold from the reference ufqg01.fasta. The hybrid scaffold was constructed using default settings; conflict resolution was set to ‘Resolve Conflicts’ for both the Bionano assembly and sequence assembly.

Assembly was carried out using Falcon-unzip (falcon-kit 1.1.1) (Chin et al., 2016). Haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020), and a first round of scaffolding was carried out with 10X Genomics read clouds using scaff10x. Hybrid scaffolding was performed using the BioNano DLE-1 data and Bionano Solve v3.3. The Hi-C scaffolded assembly was polished with arrow using the PacBio data, then polished with the 10X Genomics Illumina data by aligning to the assembly with longranger align, calling variants with freebayes (Garrison & Marth, 2012) and applying homozygous non-reference edits using bcftools consensus. Two rounds of the Illumina polishing were applied. The assembly was checked for contamination and manually corrected using the gEVAL system (Chow et al., 2016; Howe et al., 2021). This reduced the sequence length by 2.2% and the scaffold count by 44.7% whilst increasing the scaffold N50 by 4.4%. The genome was analysed within the BlobToolKit environment (Challis et al., 2020). Software versions are given in Table 3.
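The Illumina polishing step applies only those variant calls where the reads are homozygous for a non-reference allele, on the reasoning that such sites are more likely to be assembly errors than true heterozygosity. The toy sketch below illustrates that selection logic in plain Python. It is not the bcftools consensus implementation and does not reproduce the pipeline used here: the function name, the tuple layout and the genotype strings are assumptions made for this example.

```python
def apply_homozygous_edits(sequence, variants):
    """Apply only homozygous non-reference variants to a sequence.

    variants is an iterable of (position, ref, alt, genotype) tuples, with
    0-based positions and VCF-style genotype strings such as "0/1" or "1/1".
    This tuple layout is a hypothetical simplification of a VCF record.
    """
    homozygous_alt = {"1/1", "1|1"}
    edits = sorted(v for v in variants if v[3] in homozygous_alt)

    pieces = []
    cursor = 0  # first position of the original sequence not yet copied
    for position, ref, alt, _genotype in edits:
        if position < cursor:
            raise ValueError(f"overlapping edit at position {position}")
        if sequence[position:position + len(ref)] != ref:
            raise ValueError(f"reference allele mismatch at position {position}")
        pieces.append(sequence[cursor:position])  # unchanged stretch
        pieces.append(alt)                        # polished allele
        cursor = position + len(ref)
    pieces.append(sequence[cursor:])
    return "".join(pieces)


draft = "ACGTACGT"
calls = [
    (1, "C", "T", "1/1"),   # homozygous substitution: applied
    (4, "A", "AG", "0/1"),  # heterozygous insertion: left alone
    (6, "GT", "G", "1/1"),  # homozygous deletion: applied
]
polished = apply_homozygous_edits(draft, calls)
print(polished)  # ATGTACG
```

Heterozygous calls are skipped because they may reflect real differences between the two haplotypes of the sequenced individual rather than errors in the consensus.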

Table 3. Software tools used.

Software tool Version Source
Falcon-unzip falcon-kit 1.2.2 (Chin et al., 2016)
purge_dups 1.0.0 (Guan et al., 2020)
SALSA2 2.2 (Ghurye et al., 2019)
scaff10x 4.2 https://github.com/wtsi-hpag/Scaff10X
arrow GenomicConsensus 2.3.3 https://github.com/PacificBiosciences/GenomicConsensus
longranger align 2.2.2 https://support.10xgenomics.com/genome-exome/software/pipelines/latest/advanced/other-pipelines
freebayes v1.1.0-3-g961e5f3 (Garrison & Marth, 2012)
bcftools consensus 1.9 http://samtools.github.io/bcftools/bcftools.html
HiGlass 1.11.6 (Kerpedjiev et al., 2018)
PretextView 0.0.4 https://github.com/wtsi-hpag/PretextView
gEVAL N/A (Chow et al., 2016)
BlobToolKit 2.5 (Challis et al., 2020)

Data availability

Underlying data

European Nucleotide Archive: Aquila chrysaetos chrysaetos (European golden eagle) genome assembly, Accession number PRJEB33202 https://www.ebi.ac.uk/ena/browser/view/PRJEB33202.
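For readers who want to list the deposited runs programmatically, the sketch below builds a query URL for the ENA Portal API file report. It is a hedged example rather than an official recipe: the endpoint path, the `read_run` result type and the field names reflect the ENA Portal API as generally documented and should be checked against the current ENA documentation, and the function name `filereport_url` is hypothetical. The accession is the one given in the statement above.

```python
from urllib.parse import urlencode

# Assumed ENA Portal API endpoint for per-run file reports.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession, fields=("run_accession", "instrument_platform", "fastq_ftp")):
    """Build a file report URL listing the sequencing runs under an accession."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",        # one row per sequencing run
        "fields": ",".join(fields),  # columns to return
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


url = filereport_url("PRJEB33202")
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; no request is made by the sketch itself.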

The genome sequence is released openly for reuse. The A. chrysaetos chrysaetos genome sequencing initiative is part of the Wellcome Sanger Institute’s “25 genomes for 25 years” project. It is also part of the Vertebrate Genome Project (VGP) ordinal references programme and the Darwin Tree of Life (DToL) project. The specimen has been preserved at Edinburgh University and will be deposited in CryoArks. All raw data and the assembly have been deposited in the ENA. The genome will be annotated and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1.

Acknowledgements

We thank Mike Stratton and Julia Wilson for their continuing support for the 25 genomes for 25 years project. We thank Erez Lieberman Aiden, Olga Dudchenko and especially Arina Omer for advice on Hi-C sequencing and access to the A. chrysaetos canadiensis genome in advance of publication.

Funding Statement

This work was supported by the Wellcome Trust through core funding to the Wellcome Sanger Institute (206194) and the Darwin Tree of Life Discretionary Award 218328. SMcC and RD were supported by Wellcome grant 207492. The Bionano Saphyr at Deep Seq was funded by the Future Food Beacon at the University of Nottingham.

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

[version 1; peer review: 3 approved]

References

  1. Challis R, Richards E, Rajan J, et al.: BlobToolKit – Interactive Quality Assessment of Genome Assemblies. G3 (Bethesda). 2020;10(4):1361–1374. doi: 10.1534/g3.119.400908
  2. Chin CS, Peluso P, Sedlazeck FJ, et al.: Phased Diploid Genome Assembly with Single-Molecule Real-Time Sequencing. Nat Methods. 2016;13(12):1050–54. doi: 10.1038/nmeth.4035
  3. Chow W, Brugger K, Caccamo M, et al.: gEVAL — a web-based browser for evaluating genome assemblies. Bioinformatics. 2016;32(16):2508–2510. doi: 10.1093/bioinformatics/btw159
  4. Fielding AH, Whitfield DP, McLeod DRA: Spatial association as an indicator of the potential for future interactions between wind energy developments and golden eagles Aquila chrysaetos in Scotland. Biological Conservation. 2006;131(3):359–369. doi: 10.1016/j.biocon.2006.02.011
  5. Garrison E, Marth G: Haplotype-based variant detection from short-read sequencing. arXiv:1207.3907. 2012.
  6. Ghurye J, Rhie A, Walenz BP, et al.: Integrating Hi-C links with assembly graphs for chromosome-scale assembly. PLoS Comput Biol. 2019;15(8):e1007273. doi: 10.1371/journal.pcbi.1007273
  7. Guan D, McCarthy SA, Wood J, et al.: Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics. 2020;36(9):2896–2898. doi: 10.1093/bioinformatics/btaa025
  8. Howe K, Chow W, Collins J, et al.: Significantly improving the quality of genome assemblies through curation. Gigascience. 2021;10(1):giaa153. doi: 10.1093/gigascience/giaa153
  9. Kerpedjiev P, Abdennur N, Lekschas F, et al.: HiGlass: Web-Based Visual Exploration and Analysis of Genome Interaction Maps. Genome Biol. 2018;19(1):125. doi: 10.1186/s13059-018-1486-1
  10. Nebel C, Gamauf A, Haring E, et al.: New insights into population structure of the European golden eagle (Aquila chrysaetos) revealed by microsatellite analysis. Biological Journal of the Linnean Society. 2019;128(3):611–631. doi: 10.1093/biolinnean/blz130
  11. Nebel C, Gamauf A, Haring E, et al.: Mitochondrial DNA analysis reveals Holarctic homogeneity and a distinct Mediterranean lineage in the Golden eagle (Aquila chrysaetos). Biological Journal of the Linnean Society. 2015;116(2):328–340. doi: 10.1111/bij.12583
  12. Sato Y, Ogden R, Komatsu M, et al.: Integration of wild and captive genetic management approaches to support conservation of the endangered Japanese golden eagle. Biological Conservation. 2017;213:175–184. doi: 10.1016/j.biocon.2017.07.008
  13. Simão FA, Waterhouse RM, Ioannidis P, et al.: BUSCO: Assessing Genome Assembly and Annotation Completeness with Single-Copy Orthologs. Bioinformatics. 2015;31(19):3210–12. doi: 10.1093/bioinformatics/btv351
  14. Watson J, Brockie K, Watson D: The golden eagle. New Haven, Yale University Press. 2010.

Wellcome Open Res. 2021 Nov 15. doi: 10.21956/wellcomeopenres.18335.r46304

Reviewer response for version 1

Xiang-jiang Zhan 1, Zhongru Gu 1

This data note presents the chromosomal genome assembly of the European golden eagle, Aquila chrysaetos chrysaetos. Although it was not the first golden eagle genome assembly reported, the authors de novo sequenced a new individual using hybrid techniques including PacBio, 10X Genomics, Hi-C and BioNano. It will facilitate comparative and population genomic analyses of the golden eagle in future. In general, the report was clearly written and the authors did a good job for this top predator. I have only two comments:

  1. There are several inconsistencies in the description of software. The version of Falcon-unzip is inconsistent between the method part (falcon-kit 1.1.1) and Table 3 (1.2.2). The SALSA2 listed in Table 3 is not mentioned in the method. The Bionano Solve (v3.3) is mentioned in the method, but not in Table 3.

  2. The authors could increase the font size in figure 2 and 3.

Are sufficient details of methods and materials provided to allow replication by others?

Partly

Is the rationale for creating the dataset(s) clearly described?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Are the protocols appropriate and is the work technically sound?

Yes

Reviewer Expertise:

Conservation genetics of raptorial birds

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Wellcome Open Res. 2021 Oct 7. doi: 10.21956/wellcomeopenres.18335.r46301

Reviewer response for version 1

Carles Lalueza-Fox 1

This is a comprehensible and well-written report on the generation of a golden eagle female specimen. Golden eagles have ecological importance for being top predators among birds, but also as religious and political icons along history in many different regions and countries. Their genome can provide the basis for further research on their specific adaptations. The details of the sequencing effort and the quality of the assembly and annotation are sound; overall, I think this would be a valuable resource for the scientific community.

I only have a minor comment; maybe the authors should cite a previous work (Doyle et al. 2014 [1]) where a male from another subspecies (A. c. canadiensis) from Sierra Nevada (USA) was sequenced in 2,552 scaffolds. Obviously, due to technological improvements, the current annotation is much better, but future researchers could likely use the previous one to explore inter-subspecies diversity.

Also, as a minor comment, Linnaeus 1758 should be between brackets.

Are sufficient details of methods and materials provided to allow replication by others?

Yes

Is the rationale for creating the dataset(s) clearly described?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Are the protocols appropriate and is the work technically sound?

Yes

Reviewer Expertise:

Evolutionary Genomics, Paleogenomics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

References

  1. Doyle et al.: The genome sequence of a widespread apex predator, the golden eagle (Aquila chrysaetos). PLoS One. 2014;9(4):e95599. doi: 10.1371/journal.pone.0095599

Wellcome Open Res. 2021 Jun 10. doi: 10.21956/wellcomeopenres.18335.r43937

Reviewer response for version 1

Bengt Hansson 1

I think this is a well-written Data Note, which presents an important resource for further genetic analyses of the golden eagle and related species. I have only two comments.

  1. First, the focus on UK and Ireland in the Introduction seems a bit arbitrary for a general readership. Similar stories about population declines can be applied to several countries. Perhaps, simply, a well-placed “For example” (or similar) would be enough?

  2. Secondly, the HiC individual is not included in Table 1, which then provides incorrect information.

Are sufficient details of methods and materials provided to allow replication by others?

Yes

Is the rationale for creating the dataset(s) clearly described?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Are the protocols appropriate and is the work technically sound?

Yes

Reviewer Expertise:

Population genomics and genetics of birds

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
