2025 Aug 22;19:97. doi: 10.1186/s40246-025-00811-z

Table 2.

Gene model options

| Gene Model | Strengths | Weaknesses |
| --- | --- | --- |
| Ensembl | Reliable cross-species gene annotation via combined manual & automated methods; supports transcript diversity & comparative genomics; regularly updated with VEP & BioMart links; core genome browser support. | Annotation varies by species; complex transcripts may be modelled inconsistently; dependent on the quality of the genome assembly. |
| GENCODE | High-quality gene annotation for human/mouse; includes lncRNAs, pseudogenes, & transcripts; integrates manual curation & automation; captures transcript diversity; used within Ensembl, RefSeq, & UCSC. | Incomplete experimental support for all transcripts; redundancy & unclear function in many lncRNAs/pseudogenes; manual curation limits scalability; inter-version differences may affect coordinate tracking. |
| RefSeq | High-quality, curated reference sequences for genomic, transcript, & protein data; consistent annotation across species; integrates manual curation with scalable automation; widely adopted in tools like VEP, ANNOVAR, GATK. | Tends to be conservative and includes fewer transcript isoforms; updated less frequently than other models; centrally managed by NCBI with no community input. |
| UCSC | Curated gene models from mRNA/protein alignments enhance RNA-seq quantification; emphasizes reliable transcripts & simplifies isoform sets for reproducible gene counts; integrated with UCSC Genome Browser. | Limited coverage of transcript diversity, isoforms, & non-coding RNAs; fewer splice junctions reduce RNA-seq accuracy; biased toward canonical genes. |
| UniProt | Provides detailed protein-level annotation including domains, function, and subcellular localization. | Does not directly annotate variants or regulatory elements; protein-focused. |