
2017 May 5;6(6):1–5. doi: 10.1093/gigascience/gix033


© The Authors 2017. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.


Figure 2: An overview of the annotation workflow. The workflow takes assembled genomic sequences as input and produces repeat annotation, protein-coding gene predictions, and functional annotation. (a) Repeat annotation: repeats in the genome are detected by two methods, de novo and homology-based. In the de novo method, RepeatScout, LTR-FINDER, and RepeatModeler are used to build de novo repeat libraries, which are then classified with RepeatMasker. In the homology-based method, RepeatMasker and RepeatProteinMask are run to find transposable elements (TEs) by aligning sequences against existing libraries. (b) Gene prediction: TEs are fully masked before gene prediction. Augustus and GlimmerHMM are used for de novo prediction; BLAT and GeneWise are used to predict gene models from homologous protein sequences. (c) GLEAN is run to obtain a consensus gene set. (d) The consensus set is combined with the clean RNA sequencing reads to produce a more comprehensive final gene set. (e) The completeness of the gene set is estimated using BUSCO. (f) Functional annotation.
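
The ordering of panels (a) to (f) can be written down as a small dependency graph. The following is only an illustrative sketch, not the authors' pipeline: the `Stage` class, the `WORKFLOW` table, and the `execution_order` helper are hypothetical names. The dependency edges are assumptions read off the caption; in particular, it is assumed that (e) and (f) both operate on the final gene set from (d). No tool is actually invoked; the tool names are plain labels taken from the caption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One panel of the figure (hypothetical helper type)."""
    panel: str
    name: str
    tools: tuple = ()        # tool names as listed in the caption
    depends_on: tuple = ()   # panels assumed to finish first


# Dependencies are assumptions inferred from the caption's wording.
WORKFLOW = (
    Stage("a", "repeat annotation",
          ("RepeatScout", "LTR-FINDER", "RepeatModeler",
           "RepeatMasker", "RepeatProteinMask")),
    Stage("b", "gene prediction",
          ("Augustus", "GlimmerHMM", "BLAT", "GeneWise"), ("a",)),
    Stage("c", "consensus gene set", ("GLEAN",), ("b",)),
    Stage("d", "integration with RNA reads", (), ("c",)),
    Stage("e", "completeness estimation", ("BUSCO",), ("d",)),
    Stage("f", "functional annotation", (), ("d",)),
)


def execution_order(stages):
    """Return panel labels in an order that respects depends_on."""
    done, order = set(), []
    pending = list(stages)
    while pending:
        # A stage is ready once all of its prerequisites are done.
        ready = [s for s in pending if set(s.depends_on) <= done]
        if not ready:
            raise ValueError("cyclic or missing dependency")
        for stage in ready:
            order.append(stage.panel)
            done.add(stage.panel)
        pending = [s for s in pending if s.panel not in done]
    return order


if __name__ == "__main__":
    print(" -> ".join(execution_order(WORKFLOW)))
```

Under these assumed edges, stages (e) and (f) become ready at the same time, which reflects that completeness estimation and functional annotation do not depend on each other in the caption.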