2015 Nov 8;44(Database issue):D733–D745. doi: 10.1093/nar/gkv1189

Table 1. RefSeq accession prefixes.

Prefix	Molecule type	Use context
NC_¹	DNA	Chromosomes
		Linkage Groups
AC_¹	DNA	Chromosomes
		Linkage Groups
NZ_²	DNA	Chromosomes
		Scaffolds
		Used predominantly for prokaryotic genomes.
NT_³	DNA	Scaffolds
NW_³	DNA	Scaffolds
NG_¹	DNA	Genomic regions.
		A genomic region record may represent a single or multiple genetic loci (e.g. rRNA targeted locus, RefSeqGene, non-transcribed pseudogene)
NM_³,⁴	mRNA	Protein-coding transcripts
XM_³,⁵	mRNA	Protein-coding transcripts
NR_³,⁴	RNA	Non-protein-coding transcripts, including lncRNAs, structural RNAs, transcribed pseudogenes, and transcripts with unlikely protein-coding potential from protein-coding genes
XR_³,⁵	RNA	Non-protein-coding transcripts, as above
NP_³,⁴	protein	Proteins annotated on NM_ transcript accessions or annotated on genomic molecules without an instantiated transcript (e.g. some mitochondrial genomes, viral genomes, and reference bacterial genomes)
AP_³	protein	Proteins annotated on AC_ genomic accessions or annotated on genomic molecules without an instantiated transcript record
XP_³,⁵	protein	Proteins annotated on XM_ transcript accessions or annotated on genomic molecules without an instantiated transcript record
YP_³	protein	Proteins annotated on genomic molecules without an instantiated transcript record
WP_⁶	protein	Proteins that are non-redundant across multiple strains and species. A single protein of this type may be annotated on more than one prokaryotic genome

¹The complete accession number format consists of the prefix, including the underscore, followed by 6 digits, followed by the sequence version number.

²The complete accession number format consists of the prefix, followed by the INSDC accession number that the RefSeq record is based on, followed by the RefSeq sequence version number.

³The complete accession number format consists of the prefix, including the underscore, followed by 6 or 9 digits, followed by the sequence version number.

⁴Records with this accession prefix have been curated by NCBI staff or a model organism database, or are in the pool of accessions that curators work with. These records are referred to as the ‘known’ RefSeq dataset.

⁵Records with this accession prefix are generated through either the eukaryotic genome annotation pipeline or the small eukaryotic genome annotation pipeline. Records generated via the first method are referred to as the ‘model’ RefSeq dataset.

⁶The complete accession number format consists of the prefix, including the underscore, followed by 9 digits, followed by the version number. The version number is always ‘.1’ as these records are not subject to update. See online documentation for additional information: www.ncbi.nlm.nih.gov/refseq/about/nonredundantproteins/.
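
The format rules in the footnotes can be sketched as a small Python validator. This is an illustration rather than an NCBI tool: the names `RULES`, `RefSeqAccession` and `parse_accession` are hypothetical, the version is assumed to be written as a dot followed by an integer, and the pattern used for the INSDC accession inside NZ_ records (uppercase letters followed by digits) is an assumption, since the table does not specify it. The prefix alone cannot distinguish the two pipelines in footnote 5, so XM_, XR_ and XP_ records are only labelled as pipeline-generated.

```python
import re
from typing import NamedTuple, Optional

SIX = r"\d{6}"                    # footnote 1: exactly 6 digits
SIX_OR_NINE = r"\d{6}(?:\d{3})?"  # footnote 3: 6 or 9 digits
NINE = r"\d{9}"                   # footnote 6: exactly 9 digits
INSDC = r"[A-Z]+\d+"              # footnote 2; ASSUMED shape, not given in the table

# prefix -> (molecule type from Table 1, pattern for the part after the underscore)
RULES = {
    "NC": ("DNA", SIX), "AC": ("DNA", SIX), "NG": ("DNA", SIX),
    "NZ": ("DNA", INSDC),
    "NT": ("DNA", SIX_OR_NINE), "NW": ("DNA", SIX_OR_NINE),
    "NM": ("mRNA", SIX_OR_NINE), "XM": ("mRNA", SIX_OR_NINE),
    "NR": ("RNA", SIX_OR_NINE), "XR": ("RNA", SIX_OR_NINE),
    "NP": ("protein", SIX_OR_NINE), "AP": ("protein", SIX_OR_NINE),
    "XP": ("protein", SIX_OR_NINE), "YP": ("protein", SIX_OR_NINE),
    "WP": ("protein", NINE),
}

KNOWN = {"NM", "NR", "NP"}      # footnote 4: the 'known' dataset
PIPELINE = {"XM", "XR", "XP"}   # footnote 5: annotation-pipeline records


class RefSeqAccession(NamedTuple):
    prefix: str
    body: str
    version: int
    molecule: str
    origin: Optional[str]  # "known", "pipeline", or None if no footnote applies


def parse_accession(text: str) -> Optional[RefSeqAccession]:
    """Return the parsed accession, or None if it breaks a footnote rule."""
    # Overall shape: two-letter prefix, underscore, body, dot, version.
    m = re.fullmatch(r"([A-Z]{2})_([A-Z0-9]+)\.([0-9]+)", text.strip())
    if m is None:
        return None
    prefix, body, version = m.group(1), m.group(2), int(m.group(3))
    if prefix not in RULES:
        return None
    molecule, body_pattern = RULES[prefix]
    if re.fullmatch(body_pattern, body) is None:
        return None
    # Footnote 6: WP_ records are never updated, so the version is always 1.
    if prefix == "WP" and version != 1:
        return None
    if prefix in KNOWN:
        origin = "known"
    elif prefix in PIPELINE:
        origin = "pipeline"
    else:
        origin = None
    return RefSeqAccession(prefix, body, version, molecule, origin)
```

For example, `parse_accession("NM_000546.6")` yields a record with molecule `"mRNA"` and origin `"known"`, whereas `parse_accession("WP_000000001.2")` returns `None` because of the fixed version rule in footnote 6.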