Chromosome-Scale Genome Assembly and Annotation of Allotetraploid Annual Bluegrass (Poa annua L.)

Matthew D Robbins; B Shaun Bushman; David R Huff; Christopher W Benson; Scott E Warnke; Chase A Maughan; Eric N Jellen; Paul G Johnson; Peter J Maughan

Genome Biol Evol. 2022 Dec 28;15(1):evac180. doi: 10.1093/gbe/evac180

Editor: Maud Tenaillon

PMCID: PMC9838796 PMID: 36574983

Abstract

Poa annua L. is a globally distributed grass with economic and horticultural significance as a weed and as a turfgrass. This dual significance, together with its phenotypic plasticity and ecological adaptation, has made P. annua an intriguing plant for genetic and evolutionary studies. Because of the lack of genomic resources and its allotetraploid (2n = 4x = 28) nature, a reference genome sequence would be a valuable asset to better understand the significance and polyploid origin of P. annua. Here we report a genome assembly in which 14 scaffolds represent the haploid chromosomes, totaling 1.78 Gb in length with an N50 of 112 Mb and containing 96.7% of BUSCO orthologs. Seventy percent of the genome was identified as repetitive elements, and 91.0% of the retroelements were Copia- or Gypsy-like long-terminal repeats. The genome was annotated with 76,420 genes spanning 13.3% of the 14 chromosomes. The two subgenomes originating from Poa infirma (Kunth) and Poa supina (Schrad) were sufficiently divergent to be distinguishable but syntenic in sequence and annotation, with repetitive elements contributing to the expansion of the P. infirma subgenome.

Keywords: long-read sequencing, genome assembly, genome annotation, polyploidy, turfgrass, Poaceae

Significance.

Poa annua is a widely distributed cool-season grass with ecological and horticultural significance. Here we present a high-quality, chromosome-level reference genome and annotation for P. annua that identifies subgenome components. This reference genome is a valuable resource for investigating the genetic mechanisms that contribute to P. annua's wide morphological variability.

Introduction

Poa annua (L.), or annual bluegrass, is an allotetraploid (2n = 4x = 28) in the Ochlopoa section of the Poaceae family (Gillespie and Soreng 2005), with Poa infirma (Kunth) and Poa supina (Schrad) considered its diploid ancestral genomes (Nannfeldt 1937; Tutin 1952; Soreng et al. 2010; Mao and Huff 2012). Despite its name, P. annua occurs as a continuum between annual and perennial forms (Heide 2001). Highly adaptable and exhibiting phenotypic plasticity for internode lengths (La Mantia and Huff 2011), P. annua has a broad distribution from continental climates to Antarctica (Mao and Huff 2012; Molina-Montenegro et al. 2012). It is also a common noxious weed across the world, such that eradicating, controlling, or acquiescing to P. annua has been a challenge for restoration ecologists, turfgrass researchers, and land managers for decades (Huff 2003).

Poa annua predominantly self-pollinates (Tutin 1957; Ellis et al. 1971), likely an offshoot of its P. infirma subgenome component, and its heterozygosity level is therefore relatively low compared to out-crossing cool-season grasses. Poa annua's genome 1C content has been broadly estimated between 1.94 and 2.90 pg (Bennett 1972; Mowforth and Grime 1989; Mao and Huff 2012). Cytogenetic data identified 2 larger chromosomes and 12 smaller chromosomes (Koshy 1968), but the progenitor species of the 2 largest chromosomes was unclear. No genomic reference sequences exist for P. annua, and the nearest relatives with a reference genome are Brachypodium distachyon (L.) P. Beauv. and barley (Hordeum vulgare L.), with the latter also characterized by a base chromosome number of seven (x = 7). Complementary DNA (cDNA) libraries have been created to investigate the relationship of P. annua with its progenitor species, and several patent sequences for herbicide target genes have been identified (Chen et al. 2016). Considering its ecological and horticultural importance, a reference genome of P. annua is critical to understand its subgenome relationships and the genetic contributions to its adaptability. Herein we report a chromosome-scale reference genome of a P. annua genotype from the National Plant Germplasm System accession PI 595837.

Results and Discussion

Genome Sequencing and Assembly

Sequencing the P. annua genome with PacBio HiFi technology produced 91.8 Gb, which is 49× coverage of the 1.89 Gb genome size estimated by flow cytometry (Mao and Huff 2012). Analysis of k-mers from the PacBio HiFi reads provided estimates of 1.76 Gb in length with 11.1% heterozygosity and 81.8% repetitiveness. The initial haploid assembly from hifiasm had 1,729 contigs with a total length of 1.85 Gb, an N50 of 65 Mb, a BUSCO coverage of 96.7% (see Supplementary material online), and an alignment rate of 99.88% of the original HiFi reads. Omni-C scaffolding produced chromosome-level scaffolds and raised the N50 to 112 Mb. Of the 1,574 scaffolds, 32 were identified as contaminants, 949 were chloroplast sequences, 256 were mitochondrial sequences, and 311 were repetitive sequences (see Supplementary material online). The remaining 26 scaffolds could be distinguished based on size differences: 12 small, uncharacterized scaffolds ranged from 23.3 to 419 kb, spanning a total of 17.7 Mb, while the 14 largest scaffolds ranged from 72.5 to 320.7 Mb, spanning a total of 1.78 Gb, and putatively represent the 14 chromosomes of the haploid P. annua genome (fig. 1B). These 14 pseudomolecules contain 99.99% of the total assembly length, contain all 96.7% of the BUSCO orthologs, and comprise 93.7% of the 1.89 Gb estimated genome size. Together, these results indicate the 14 pseudomolecules are a highly complete representation of the 14 chromosomes of the P. annua genome.
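The N50 and coverage figures quoted above follow from simple definitions. The sketch below is an illustration only, not the QUAST implementation: the function name `n50_l50` and the toy lengths are hypothetical, while the 91.8 Gb yield and the 1.89 Gb flow cytometry estimate are taken from the text.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    N50 is the length of the sequence at which the running total of
    lengths, sorted from longest to shortest, first reaches half of the
    assembly size; L50 is how many sequences were needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths, not the P. annua scaffolds.
toy_n50, toy_l50 = n50_l50([100, 80, 60, 40, 20])

# Sequencing depth: total HiFi yield divided by the estimated genome size.
hifi_yield_gb = 91.8      # from the text
genome_size_gb = 1.89     # flow cytometry estimate cited in the text
coverage = hifi_yield_gb / genome_size_gb
```

With these inputs the ratio is about 48.6, which rounds to the reported 49× coverage.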

Fig. 1. Characteristics of the 14 P. annua chromosomes in the presented assembly. (A) Circular representation of the chromosomes. Graphs are, from outside towards center, chromosome length in Mb (solid bar with chromosome name) and distributions of GC content, gene number, Gypsy/DIRS1-like repeats, Ty1/Copia-like repeats, and telomeres in 500 kb windows, except GC content, which is in 2 Mb windows. (B) Karyotype of PI 595837. (C) Comparison of homoeologous chromosomes between A and B subgenomes based on genomic sequence. (D) Visualization of synteny and collinearity between A and B subgenomes based on gene annotations.

Previously developed cDNA libraries from P. annua, P. infirma, and P. supina (Chen et al. 2016) were mapped to the 14 chromosomes. Ninety-seven percent of each of those 3 EST libraries mapped to the 14 chromosomes (see Supplementary material online). The P. infirma cDNAs clearly mapped to 7 of the 14 chromosomes and the P. supina cDNAs clearly mapped to the other 7 chromosomes (see Supplementary material online).

The P. annua chromosomes were named with a “Pa” prefix and “A” or “B” as a suffix based on progenitor subgenomes from P. infirma (A) and P. supina (B). The subgenome from P. infirma was designated A because it is the maternal parent source of tetraploidization events that led to P. annua (Soreng et al. 2010; Mao and Huff 2012; Chen et al. 2016). The number in the chromosome name is from 1 to 7 based on longest to shortest of the A subgenome.

Repeat Analysis and Gene Annotation

A total of 70.4% of the 14 chromosomes were repetitive sequence with 42.0% as retroelements, 3.3% as DNA transposons, and 25.9% as unclassified repetitive elements (table 1). Of the retroelements, 91.0% were Copia- or Gypsy-like long-terminal repeat (LTR) elements, which were distributed throughout the genome but enriched near the center of most of the chromosomes (table 1 and fig. 1A). Since LTR Gypsy repeats include centromeric retrotransposons (Sharma and Presting 2014), the distribution of these repeats suggests P. annua chromosomes are metacentric or sub-metacentric, consistent with the karyotype (fig. 1B). Telomeric repeats were identified at the ends of chromosomes Pa1A, Pa4B, Pa5B, and Pa7B (fig. 1A).

Table 1.

Poa annua Assembly Metrics

Metric	Pseudomolecules	A subgenome	B subgenome
Assembly
No. contigs	14	7	7
Largest contig	320,692,588	320,692,588	145,879,171
Total length	1,777,993,459	1,115,915,925	662,077,534
GC (%)	46	46	46
N50	112,260,586	264,562,364	98,219,837
N75	98,219,837	102,273,047	74,350,140
L50	5	2	3
L75	9	5	5
# N's per 100 kbp	1	1	1
BUSCO
Complete	3,128 (96.7%)	3,049 (94.2%)	3,047 (94.2%)
Complete single copy	336 (10.4%)	2,964 (91.6%)	2,944 (91.0%)
Complete duplicated	2,792 (86.3%)	115 (3.6%)	103 (3.2%)
Fragmented	94 (2.9%)	145 (4.5%)	168 (5.2%)
Missing	13 (0.4%)	42 (1.3%)	21 (0.7%)
Repeats
Ty1/Copia LTR	103,097 (12.35%)	72,581 (14.53%)	30,516 (8.67%)
Gypsy/DIRS1 LTR	186,117 (25.88%)	121,479 (29.03%)	64,638 (20.57%)
Long interspersed nuclear elements	52,176 (2.12%)	26,039 (1.87%)	26,137 (2.53%)
DNA transposons	64,321 (3.33%)	32,286 (2.98%)	32,035 (3.94%)
Unclassified	912,643 (25.94%)	530,195 (27.27%)	382,448 (23.7%)
Total interspersed repeats	2,064,239 (70.38%)	1,265,324 (76.7%)	798,915 (59.73%)
Annotation
No. of genes	76,420	37,817	38,603
No. of genes with functional annotation	49,132	24,141	24,991
Percent of assembly covered by genes	13.3	10.4	18.3
Mean gene length	3,104	3,076	3,132
Mean cds length	1,207	1,203	1,211
Mean no. of exons per gene	4.7	4.7	4.7
Mean exon length	333	333	332
Mean no. of introns per gene	3.4	3.4	3.4
Mean intron length	420	412	428

Metrics are reported for the pseudomolecules representing the 14 P. annua chromosomes and for the 7 chromosomes of each of the A (P. infirma) and B (P. supina) subgenomes.

Sequencing the transcriptome of P. annua using PacBio HiFi sequencing with Iso-Seq produced 31.0 Gb and 50,473 full-length transcripts with 519,743 isoforms. The transcriptome coverage was characterized with 96.5% complete BUSCOs, 6.2% as single-copy, and 89.4% as duplicated BUSCOs, the duplication being expected for a tetraploid. Using this transcriptome with additional gene evidence from barley, wheat (Triticum aestivum L.), and Brachypodium produced 76,420 predicted gene models. These genes spanned 237.7 Mb, or 13.3%, of the 14 chromosomes with an average length of 3,104 bp (table 1). Sixty-four percent of the predicted genes were functionally annotated using the UniProtKB/Swiss-Prot database. The BUSCO scores of the predicted transcripts were 94.9% complete, with 20.8% single-copy and 74.1% duplicated. The annotated genes were, in general, inversely distributed relative to the Gypsy- and Copia-like LTR repeats (fig. 1A).

Poa annua Subgenomes

The A subgenome (P. infirma) and the B subgenome (P. supina) were sufficiently divergent to be distinguishable, yet highly syntenic in sequence and gene content (fig. 1C and D). The full 14 P. annua chromosomes had 86% BUSCO duplication, while each of the subgenomes had <4% BUSCO duplication (table 1). The number of genes, exons, and introns, as well as the lengths of the genes, introns, and exons were similar between subgenomes (table 1). Gene annotations were in similar syntenic blocks near the distal portions of each homoeologous chromosome (fig. 1A and D), with some evidence of translocations between non-homoeologous chromosomes such as Pa1A and Pa4B (fig. 1D). The genomic sequence between the A and B subgenomes was also highly syntenic, although inversions were present in each homoeologous pair (fig. 1C).

Despite the homology and synteny in coding and non-coding regions, the A subgenome was 1.69 times the length of the B subgenome, and the two longest A chromosomes, Pa1A and Pa2A, were approximately twice as long as the largest B chromosome (fig. 1). These observations are consistent with the karyotype of PI 595837 (fig. 1B) and previous cytological reports (Nannfeldt 1937; Koshy 1968), and highlight that the two longest chromosomes both originated from the P. infirma ancestral genome. The disparity in subgenome size was mainly due to differences in repetitive elements between the subgenomes. The sequence of subgenome A (P. infirma) was 76.7% repetitive elements compared to 59.7% for subgenome B (P. supina).
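The size ratios in this paragraph can be checked against the lengths in table 1. The short calculation below is a worked check rather than part of the published analysis; the variable names are illustrative, and treating the A subgenome N50 (with L50 = 2) as the length of the second-longest A chromosome is an assumption made here.

```python
# Lengths in bp, copied from table 1.
a_total = 1_115_915_925       # A subgenome (P. infirma) total length
b_total = 662_077_534         # B subgenome (P. supina) total length
a_largest = 320_692_588       # largest A chromosome
b_largest = 145_879_171       # largest B chromosome

# Assumption: with L50 = 2 for the A subgenome, its N50 is taken to be
# the length of the second-longest A chromosome.
a_second = 264_562_364

subgenome_ratio = a_total / b_total       # about 1.69
largest_ratio = a_largest / b_largest     # about 2.2
second_ratio = a_second / b_largest       # about 1.8

print(f"A/B subgenome length ratio: {subgenome_ratio:.2f}")
print(f"Largest A / largest B: {largest_ratio:.1f}")
print(f"Second-largest A / largest B: {second_ratio:.1f}")
```

The two subgenome totals also sum to the 1,777,993,459 bp reported for the 14 pseudomolecules.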

Materials and Methods

Sequencing and Assembly

PI 595837 is a collection from Minnesota, USA, characterized by a partial vernalization requirement. Sequencing was conducted on a PacBio Sequel II instrument at the BYU DNA Sequencing Center (DNASC; https://biology.byu.edu/dnasc) with five 8M SMRT cells.

To obtain estimates of genome size and heterozygosity, PacBio HiFi (Wenger et al. 2019) reads were analyzed using Jellyfish v. 2.2.9 (Marçais and Kingsford 2011) with the -C option, k-mer size of 21, and a max k-mer count of 1,000,000 and GenomeScope v. 2.0 (Ranallo-Benavidez et al. 2020) with k-mer length of 21 and ploidy of four. PacBio HiFi reads were assembled using hifiasm v. 0.14-r312 (Cheng et al. 2021) using default parameters. Assembled contigs were scaffolded by Dovetail Genomics LLC (Scotts Valley, CA, USA) using proximity ligation with their Omni-C and HiRise pipelines (Putnam et al. 2016; Ramachandran et al. 2021).
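As a rough illustration of what canonical k-mer counting involves (the behavior requested from Jellyfish with the -C option), the sketch below counts each k-mer together with its reverse complement and builds the count histogram that a tool such as GenomeScope takes as input. This is a toy pure-Python version written for clarity; the function names are hypothetical and it is not how Jellyfish is implemented.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_canonical_kmers(reads, k=21):
    """Count k-mers, merging each k-mer with its reverse complement.

    The canonical form is the lexicographically smaller of the two
    strands. K-mers containing characters other than A, C, G, T are
    skipped.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[min(kmer, reverse_complement(kmer))] += 1
    return counts


def kmer_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


# Toy reads with k = 3 so the result is easy to follow by hand.
toy_counts = count_canonical_kmers(["ACGTT", "AACGT"], k=3)
toy_histogram = kmer_histogram(toy_counts)
```

In the toy example, ACG and CGT collapse into one canonical k-mer, as do AAC and GTT, which is why only two distinct k-mers are counted.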

To assess the quality of the scaffolded assembly, basic assembly metrics were produced using the --large option of QUAST v. 5.0.2 (Mikheenko et al. 2018), and gene coverage was assessed with BUSCO v. 5.1.3 (Manni et al. 2021) using the liliopsida_odb10 dataset (creation date 9/10/2020). Telomeres were identified by dividing the genome into 500 kb windows using the makewindows command in BEDTools v. 2.26.0 (Quinlan and Hall 2010) and counting occurrences of the forward and reverse complement of the TTTAGGG plant-type heptamer (Peska and Garcia 2020) in each window using the BEDTools nuc command with the -pattern option. The original HiFi reads were mapped back to the assembly using the -x map-hifi preset option of minimap2 v. 2.22 (Li 2018). In addition, reads from cDNA libraries of P. annua, P. infirma, and P. supina (NCBI SRA accessions SRR1633980, SRR1634028, and SRR1634026) (Chen et al. 2016) were mapped to the assembly using HISAT2 v. 2.2.1 (Kim et al. 2019) with the --very-sensitive option to help determine assembly completeness.
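The window-based telomere count described above can be sketched in a few lines. This is an illustration of the idea, not the BEDTools implementation: the function name and the toy sequence are hypothetical, the motif is passed in as a parameter, and the example uses the Arabidopsis-type plant telomere repeat TTTAGGG.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif_in_windows(sequence, motif, window=500_000):
    """Count a motif and its reverse complement in fixed-size windows.

    Returns a list of (start, end, count) tuples with 0-based,
    half-open coordinates. Occurrences are counted without overlap
    (str.count), and a motif that spans a window boundary is not
    counted in either window.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    results = []
    for start in range(0, len(sequence), window):
        end = min(start + window, len(sequence))
        chunk = sequence[start:end]
        count = chunk.count(motif)
        if rc_motif != motif:
            count += chunk.count(rc_motif)
        results.append((start, end, count))
    return results


# Toy sequence: three forward repeats, a spacer, two reverse-complement repeats.
toy_sequence = "TTTAGGG" * 3 + "ACGTACGTAC" + "CCCTAAA" * 2
toy_windows = count_motif_in_windows(toy_sequence, "TTTAGGG", window=25)
```

Windows with a high count at a chromosome end are candidates for telomeric sequence; what threshold counts as "high" is left to the analyst.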

Identification of Contamination, Organellar Sequence, and Repeats

Scaffold sequences were analyzed using BlobTools2 v. 2.6.4 (Challis et al. 2020) and Kraken 2 v. 2.1.1 (Wood et al. 2019) to identify potential contaminants. Data provided to BlobTools2 included blastn (blast+ v. 2.11.0; Camacho et al. 2009) hits of assembly scaffolds against the nt database, read coverage of HiFi reads mapped to assembly scaffolds using minimap2, and BUSCO scores of assembled scaffolds using the liliopsida_odb10 lineage dataset. The PlusPFP-16 RefSeq database (https://benlangmead.github.io/aws-indexes/k2) was used for Kraken 2. Scaffolds were deemed contaminants and removed if the taxon was assigned outside Streptophyta, the GC percentage was extreme (GC < 40% or GC > 50%), or the read coverage was extremely low (<5). Scaffolds with greater than 99% query coverage (qcovs) after aligning to the plastid RefSeq database (release 207; https://www.ncbi.nlm.nih.gov/refseq/) using blastn were considered chloroplast and removed from the assembly. Due to the complexity of the mitochondrial genome in plants (Palmer et al. 2000; Morley and Nielsen 2017; Kozik et al. 2019), scaffolds with the best nt database hit to mitochondrial genome sequence, or with >40% query coverage to the mitochondrial RefSeq database, or scaffolds for which all functional gene annotations were mitochondrial, were considered mitochondrial sequences and removed from the assembly. Repetitive elements in scaffolds were identified and classified using RepeatModeler2 v. 2.0.2a (Flynn et al. 2020) and RepeatMasker v. 4.1.2-p1 (Smit et al. 2013) and were masked prior to annotation. Scaffolds with <10 kb of unmasked sequence were considered highly repetitive and removed from the assembly.
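The contaminant criteria above amount to three independent tests on each scaffold. The sketch below expresses them as a small Python filter. It is not the BlobTools2 or Kraken 2 interface: the `ScaffoldStats` record and its field names are hypothetical, and the GC criterion is read here as "below 40% or above 50%".

```python
from dataclasses import dataclass


@dataclass
class ScaffoldStats:
    """Per-scaffold summary; field names are hypothetical."""
    name: str
    phylum: str            # taxonomic assignment of the scaffold
    gc_percent: float      # GC content in percent
    read_coverage: float   # mean HiFi read depth


def is_contaminant(scaffold, gc_min=40.0, gc_max=50.0, min_coverage=5.0):
    """Apply the three removal criteria described in the text.

    A scaffold is flagged if it is assigned outside Streptophyta, if its
    GC content falls outside the gc_min to gc_max range (an
    interpretation of the stated GC rule), or if its read coverage is
    below min_coverage.
    """
    outside_plants = scaffold.phylum != "Streptophyta"
    extreme_gc = scaffold.gc_percent < gc_min or scaffold.gc_percent > gc_max
    low_coverage = scaffold.read_coverage < min_coverage
    return outside_plants or extreme_gc or low_coverage


# Hypothetical scaffolds for illustration only.
examples = [
    ScaffoldStats("scaffold_a", "Streptophyta", 46.0, 48.0),
    ScaffoldStats("scaffold_b", "Proteobacteria", 46.0, 48.0),
    ScaffoldStats("scaffold_c", "Streptophyta", 62.5, 48.0),
    ScaffoldStats("scaffold_d", "Streptophyta", 46.0, 2.0),
]
flagged = [s.name for s in examples if is_contaminant(s)]
```

Only the first toy scaffold passes all three tests; each of the others fails exactly one criterion.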

cDNA Sequencing and Genome Annotation

Total RNA was collected from leaf, crown, and inflorescence tissues under greenhouse, salt stress, and cold stress treatments and extracted using the DirectZol total RNA extraction kit (Zymo Research, Irvine, CA, USA). Three 8M SMRT cells were sequenced on a PacBio Sequel II instrument at the BYU DNASC. Full-length transcripts were obtained from HiFi reads using the Iso-Seq pipeline, including primer removal using lima v. 2.0 (https://github.com/pacificbiosciences/barcoding/), clustering using IsoSeq3 v. 3.4.0 (https://github.com/PacificBiosciences/IsoSeq), mapping to the assembly using pbmm2 v. 1.8.0 (https://github.com/PacificBiosciences/pbmm2/), and collapsing using IsoSeq3. Genes were annotated with MAKER2 v. 3.01 (Holt and Yandell 2011) using the Iso-Seq clusters as transcript evidence as well as coding and peptide sequences from barley cv. “Morex” V3 (Mascher et al. 2021), wheat cv. “Chinese Spring” RefSeq v2.1 (Zhu et al. 2021), and Brachypodium v3.1 (Vogel et al. 2010). Ab initio gene annotation was performed using P. annua-specific AUGUSTUS v. 3.2 (Stanke and Morgenstern 2005) gene prediction models and rice (Oryza sativa L.) SNAP (Korf 2004) gene models provided to MAKER2, with tRNA genes predicted by tRNAscan-SE (Lowe and Eddy 1997). Gene models were functionally annotated based on sequence homology with the UniProtKB/Swiss-Prot database (https://www.uniprot.org/; downloaded January 2022). Transcriptomes were evaluated for completeness using BUSCO as above, except in transcriptome mode. The distributions of genes and GC content, as well as of Ty1/Copia- and Gypsy/DIRS1-like repeats from the repeat analysis above, were obtained using BEDTools in the same manner as for counting telomeres and plotted (fig. 1A) using Circa (https://omgenomics.com/circa/).

Subgenome Identification and Characterization

The 14 P. annua chromosomes were sorted into the 2 subgenome types, P. infirma and P. supina, based on the species with the greatest proportion of primary alignments from the cDNA library mapping described above. To compare transcriptome annotations between subgenomes, a dotplot was created using the CoGe SynMap2 tool (https://genomevolution.org/SynMap.pl) to obtain DAGChainer results. The GFF file from the MAKER2 annotation (above) and the DAGChainer file were simplified using a custom script and used by MCScanX (Wang et al. 2012) to produce a collinearity file. The simplified GFF and the collinearity file were uploaded to SynVisio (https://synvisio.github.io/#/; Bandi and Gutwin 2020) to visualize annotations between subgenomes (fig. 1D). Synteny based on genome sequence was identified by mapping the subgenomes to each other using the -x asm5 preset option of minimap2, sorting and indexing the BAM file using samtools v. 1.9 (Danecek et al. 2021), and feeding the BAM file to SyRI v. 1.6 (Goel et al. 2019) to visualize syntenic blocks (fig. 1C). To compare subgenome sequence characteristics to the physical chromosomes, the karyotype of PI 595837 (fig. 1B) was prepared using previously published procedures (Jellen 2016).
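The subgenome sorting rule in the first sentence of this section can be illustrated with a short Python sketch. The exact normalization used is not stated, so the version below makes an assumption: for each progenitor library, the share of that library's primary alignments landing on a chromosome is computed, and the chromosome is assigned to the progenitor with the larger share. The function name, the input layout, and the counts are all hypothetical.

```python
def assign_subgenomes(primary_alignments):
    """Assign each chromosome to the progenitor with the larger alignment share.

    primary_alignments maps chromosome name to a dict of
    {species: number of primary alignments}. Counts are normalized by
    each library's total so that libraries of different sizes are
    comparable (an assumption, not a detail taken from the text).
    """
    library_totals = {}
    for counts in primary_alignments.values():
        for species, n in counts.items():
            library_totals[species] = library_totals.get(species, 0) + n

    assignment = {}
    for chrom, counts in primary_alignments.items():
        shares = {
            species: n / library_totals[species]
            for species, n in counts.items()
            if library_totals[species] > 0
        }
        assignment[chrom] = max(shares, key=shares.get)
    return assignment


# Hypothetical counts for two chromosomes; not data from this study.
toy_counts = {
    "chrX": {"P. infirma": 900, "P. supina": 300},
    "chrY": {"P. infirma": 100, "P. supina": 700},
}
toy_assignment = assign_subgenomes(toy_counts)
```

In the toy input, chrX receives 90% of the P. infirma alignments and 30% of the P. supina alignments, so it is assigned to P. infirma, and chrY is assigned to P. supina by the same logic.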

Acknowledgments

This work was supported by base funds from the USDA Agricultural Research Service and by the SCINet project and the AI Center of Excellence of the USDA Agricultural Research Service, ARS project number 0500-00093-001-00-D. Ed Wilcox and the BYU DNASC (RRID: SCR_017781) provided sequencing support.

Contributor Information

Matthew D Robbins, USDA ARS, Forage and Range Research, Logan, Utah.

B Shaun Bushman, USDA ARS, Forage and Range Research, Logan, Utah.

David R Huff, Department of Plant Science, Pennsylvania State University, University Park.

Christopher W Benson, Department of Plant Science, Pennsylvania State University, University Park.

Scott E Warnke, USDA ARS, Floral and Nursery Plants Research, Beltsville, Maryland.

Chase A Maughan, Plant and Wildlife Sciences Department, Brigham Young University, Provo, Utah.

Eric N Jellen, Plant and Wildlife Sciences Department, Brigham Young University, Provo, Utah.

Paul G Johnson, Plant, Soils, and Climate Department, Utah State University, Logan.

Peter J Maughan, Plant and Wildlife Sciences Department, Brigham Young University, Provo, Utah.

Supplementary Material

Supplementary Materials online are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).

Data availability

PacBio, Omni-C, and Iso-Seq reads are available in GenBank (https://www.ncbi.nlm.nih.gov/genbank/, last accessed December 16, 2022) under BioProject PRJNA841947 with SRA accession numbers SRR19374716-24. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession JAPMLF000000000. The version described in this manuscript is version JAPMLF010000000. The assembly and annotation are publicly available on WeedPedia (user account required, obtained at https://www.weedgenomics.org/weedpedia/, last accessed December 16, 2022) and on CoGe (https://genomevolution.org/coge/, last accessed December 16, 2022) under Genome ID 63982 (https://genomevolution.org/coge/GenomeView.spl?gid=63982&tracks=sequence%2Cfeatures, last accessed December 16, 2022).

Literature Cited

Bandi V, Gutwin C. 2020. Interactive exploration of genomic conservation. Proceedings of the Graph Interface. 2020 May.
Bennett MD. 1972. Nuclear DNA content and minimum generation time in herbaceous plants. Proc R Soc Lond Ser B Biol Sci. 181:109–135.
Camacho C, et al. 2009. BLAST+: architecture and applications. BMC Bioinformatics 10:1–9.
Challis R, Richards E, Rajan J, Cochrane G, Blaxter M. 2020. BlobToolKit – interactive quality assessment of genome assemblies. G3 (Bethesda) 10:1361–1374.
Chen S, McElroy JS, Dane F, Goertzen LR. 2016. Transcriptome assembly and comparison of an allotetraploid weed species, annual bluegrass, with its two diploid progenitor species, Poa supina Schrad and Poa infirma Kunth. Plant Genome 9:plantgenome2015.06.0050.
Cheng H, Concepcion GT, Feng X, Zhang H, Li H. 2021. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 18:170–175.
Danecek P, et al. 2021. Twelve years of SAMtools and BCFtools. Gigascience 10:1–4.
Ellis WM, Lee BTO, Calder DM. 1971. A biometric analysis of populations of Poa annua L. Evolution 25:29–37.
Flynn JM, et al. 2020. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci U S A. 117:9451–9457.
Gillespie LJ, Soreng RJ. 2005. A phylogenetic analysis of the bluegrass genus Poa based on cpDNA restriction site data. Syst Bot. 30:84–105.
Goel M, Sun H, Jiao WB, Schneeberger K. 2019. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 20:1–13.
Heide OM. 2001. Flowering responses of contrasting ecotypes of Poa annua and their putative ancestors Poa infirma and Poa supina. Ann Bot. 87:795–804.
Holt C, Yandell M. 2011. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12:1–14.
Huff DR. 2003. Annual bluegrass (Poa annua L). In: Casler MD, Duncan RR, editors. Turfgrass biology, genetics, and breeding. Hoboken (NJ): John Wiley & Sons Inc. p. 39–51.
Jellen EN. 2016. C-banding of plant chromosomes. Methods Mol Biol. 1429:1–5.
Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. 2019. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 37:907–915.
Korf I. 2004. Gene finding in novel genomes. BMC Bioinformatics 5:1–9.
Koshy TK. 1968. Evolutionary origin of Poa annua L. in the light of karyotypic studies. Can J Genet Cytol. 10:112–118.
Kozik A, et al. 2019. The alternative reality of plant mitochondrial DNA: one ring does not rule them all. PLoS Genet. 15:e1008373.
La Mantia JM, Huff DR. 2011. Instability of the greens-type phenotype in Poa annua L. Crop Sci. 51:1784–1792.
Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100.
Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25:955.
Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. 2021. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 38:4647–4654.
Mao Q, Huff DR. 2012. The evolutionary origin of Poa annua L. Crop Sci. 52:1910–1922.
Marçais G, Kingsford C. 2011. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27:764–770.
Mascher M, et al. 2021. Long-read sequence assembly: a technical evaluation in barley. Plant Cell 33:1888–1906.
Mikheenko A, Prjibelski A, Saveliev V, Antipov D, Gurevich A. 2018. Versatile genome assembly evaluation with QUAST-LG. Bioinformatics 34:i142–i150.
Molina-Montenegro MA, et al. 2012. Occurrence of the non-native annual bluegrass on the Antarctic mainland and its negative effects on native plants. Conserv Biol. 26:717–723.
Morley SA, Nielsen BL. 2017. Plant mitochondrial DNA. Front Biosci Landmark. 22:1023–1132.
Mowforth MA, Grime JP. 1989. Intra-population variation in nuclear DNA amount, cell size and growth rate in Poa annua L. Funct Ecol. 3:289.
Nannfeldt JA. 1937. The chromosome numbers of Poa sect. Ochlopoa A. & Gr. and their taxonomical significance. Bot Not. 1937:238–254.
Palmer JD, et al. 2000. Dynamic evolution of plant mitochondrial genomes: mobile genes and introns and highly variable mutation rates. Proc Natl Acad Sci U S A. 97:6960–6966.
Peska V, Garcia S. 2020. Origin, diversity, and evolution of telomere sequences in plants. Front Plant Sci. 11:117.
Putnam NH, et al. 2016. Chromosome-scale shotgun assembly using an in vitro method for long-range linkage. Genome Res. 26:342–350.
Quinlan AR, Hall IM. 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842.
Ramachandran D, et al. 2021. Chromosome level genome assembly and annotation of highly invasive Japanese stiltgrass (Microstegium vimineum). Genome Biol Evol. 13:1–7.
Ranallo-Benavidez TR, Jaron KS, Schatz MC. 2020. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 11:1432.
Sharma A, Presting GG. 2014. Evolution of centromeric retrotransposons in grasses. Genome Biol Evol. 6:1335–1352.
Smit AF, Hubley R, Green P. 2013. RepeatMasker Open-4.0. [cited 2022 July 22]. Available from: http://www.repeatmasker.org.
Soreng RJ, Bull RD, Gillespie LJ. 2010. Phylogeny and reticulation in Poa based on plastid trnTLF and nrITS sequences with attention to diploids. In: Seberg O, Peterson G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution of monocotyledons. Aarhus: Aarhus University Press. p. 619–644.
Stanke M, Morgenstern B. 2005. AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res. 33:W465–W467.
Tutin TG. 1952. Origin of Poa annua L. Nature 169:160.
Tutin TG. 1957. A contribution to the experimental taxonomy of Poa annua L. Watsonia 4:1–10.
Vogel JP, et al. 2010. Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463:763–768.
Wang Y, et al. 2012. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40:e49.
Wenger AM, et al. 2019. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol. 37(10):1155–1162.
Wood DE, Lu J, Langmead B. 2019. Improved metagenomic analysis with Kraken 2. Genome Biol. 20:1–13.
Zhu T, et al. 2021. Optical maps refine the bread wheat Triticum aestivum cv. Chinese Spring genome assembly. Plant J. 107:303–314.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

evac180_Supplementary_Data

Click here for additional data file.^{(168.6KB, xlsx)}

Data Availability Statement

