Author manuscript; available in PMC: 2012 Jun 21.
Published in final edited form as: Nat Rev Microbiol. 2011 May 9;9(6):467–477. doi: 10.1038/nrmicro2577

Table 2.

Classification and nomenclature of CRISPR-associated genes*

| Proposed gene name | System type or subtype | Name from Haft et al.§ | Name from Brouns et al. | Structure of encoded protein (PDB accessions) | Families (and superfamily) of encoded protein#** | Representatives |
|---|---|---|---|---|---|---|
| cas1 | Type I; Type II; Type III | cas1 | cas1 | 3GOD, 3LFX and 2YZS | COG1518 | SERP2463, SPy1047 and ygbT |
| cas2 | Type I; Type II; Type III | cas2 | cas2 | 2IVY, 2I8E and 3EXC | COG1343 and COG3512 | SERP2462, SPy1048, SPy1723 (N-terminal domain) and ygbF |
| cas3 | Type I‡‡ | cas3 | cas3 | NA | COG1203 | APE1232 and ygcB |
| cas3 | Subtype I-A; Subtype I-B | NA | NA | NA | COG2254 | APE1231 and BH0336 |
| cas4 | Subtype I-A; Subtype I-B; Subtype I-C; Subtype I-D; Subtype II-B | cas4 and csa1 | NA | NA | COG1468 | APE1239 and BH0340 |
| cas5 | Subtype I-A; Subtype I-B; Subtype I-C; Subtype I-E | cas5a, cas5d, cas5e, cas5h, cas5p, cas5t and cmx5 | casD | 3KG4 | COG1688 (RAMP) | APE1234, BH0337, devS and ygcI |
| cas6 | Subtype I-A; Subtype I-B; Subtype I-D; Subtype III-A; Subtype III-B | cas6 and cmx6 | NA | 3I4H | COG1583 and COG5551 (RAMP) | PF1131 and slr7014 |
| cas6e | Subtype I-E | cse3 | casE | 1WJ9 | (RAMP) | ygcH |
| cas6f | Subtype I-F | csy4 | NA | 2XLJ | (RAMP) | y1727 |
| cas7 | Subtype I-A; Subtype I-B; Subtype I-C; Subtype I-E | csa2, csd2, cse4, csh2, csp1 and cst2 | casC | NA | COG1857 and COG3649 (RAMP) | devR and ygcJ |
| cas8a1 | Subtype I-A‡‡ | cmx1, cst1, csx8, csx13 and CXXC-CXXC | NA | NA | BH0338-like | LA3191§§ and PG2018§§ |
| cas8a2 | Subtype I-A‡‡ | csa4 and csx9 | NA | NA | PH0918 | AF0070, AF1873, MJ0385, PF0637, PH0918 and SSO1401 |
| cas8b | Subtype I-B‡‡ | csh1 and TM1802 | NA | NA | BH0338-like | MTH1090 and TM1802 |
| cas8c | Subtype I-C‡‡ | csd1 and csp2 | NA | NA | BH0338-like | BH0338 |
| cas9 | Type II‡‡ | csn1 and csx12 | NA | NA | COG3513 | FTN_0757 and SPy1046 |
| cas10 | Type III‡‡ | cmr2, csm1 and csx11 | NA | NA | COG1353 | MTH326, Rv2823c§§ and TM1794§§ |
| cas10d | Subtype I-D‡‡ | csc3 | NA | NA | COG1353 | slr7011 |
| csy1 | Subtype I-F‡‡ | csy1 | NA | NA | y1724-like | y1724 |
| csy2 | Subtype I-F | csy2 | NA | NA | (RAMP) | y1725 |
| csy3 | Subtype I-F | csy3 | NA | NA | (RAMP) | y1726 |
| cse1 | Subtype I-E‡‡ | cse1 | casA | NA | YgcL-like | ygcL |
| cse2 | Subtype I-E | cse2 | casB | 2ZCA | YgcK-like | ygcK |
| csc1 | Subtype I-D | csc1 | NA | NA | alr1563-like (RAMP) | alr1563 |
| csc2 | Subtype I-D | csc1 and csc2 | NA | NA | COG1337 (RAMP) | slr7012 |
| csa5 | Subtype I-A | csa5 | NA | NA | AF1870 | AF1870, MJ0380, PF0643 and SSO1398 |
| csn2 | Subtype II-A | csn2 | NA | NA | SPy1049-like | SPy1049 |
| csm2 | Subtype III-A‡‡ | csm2 | NA | NA | COG1421 | MTH1081 and SERP2460 |
| csm3 | Subtype III-A | csc2 and csm3 | NA | NA | COG1337 (RAMP) | MTH1080 and SERP2459 |
| csm4 | Subtype III-A | csm4 | NA | NA | COG1567 (RAMP) | MTH1079 and SERP2458 |
| csm5 | Subtype III-A | csm5 | NA | NA | COG1332 (RAMP) | MTH1078 and SERP2457 |
| csm6 | Subtype III-A | APE2256 and csm6 | NA | 2WTE | COG1517 | APE2256 and SSO1445 |
| cmr1 | Subtype III-B | cmr1 | NA | NA | COG1367 (RAMP) | PF1130 |
| cmr3 | Subtype III-B | cmr3 | NA | NA | COG1769 (RAMP) | PF1128 |
| cmr4 | Subtype III-B | cmr4 | NA | NA | COG1336 (RAMP) | PF1126 |
| cmr5 | Subtype III-B‡‡ | cmr5 | NA | 2ZOP and 2OEB | COG3337 | MTH324 and PF1125 |
| cmr6 | Subtype III-B | cmr6 | NA | NA | COG1604 (RAMP) | PF1124 |
| csb1 | Subtype I-U | GSU0053 | NA | NA | (RAMP) | Balac_1306 and GSU0053 |
| csb2 | Subtype I-U§§ | NA | NA | NA | (RAMP) | Balac_1305 and GSU0054 |
| csb3 | Subtype I-U | NA | NA | NA | (RAMP) | Balac_1303# |
| csx17 | Subtype I-U | NA | NA | NA | NA | Btus_2683 |
| csx14 | Subtype I-U | NA | NA | NA | NA | GSU0052 |
| csx10 | Subtype I-U | csx10 | NA | NA | (RAMP) | Caur_2274 |
| csx16 | Subtype III-U | VVA1548 | NA | NA | NA | VVA1548 |
| csaX | Subtype III-U | csaX | NA | NA | NA | SSO1438 |
| csx3 | Subtype III-U | csx3 | NA | NA | NA | AF1864 |
| csx1 | Subtype III-U | csa3, csx1, csx2, DXTHG, NE0113 and TIGR02710 | NA | 1XMX and 2I71 | COG1517 and COG4006 | MJ1666, NE0113, PF1127 and TM1812 |
| csx15 | Unknown | NA | NA | NA | TTE2665 | TTE2665 |
| csf1 | Type U | csf1 | NA | NA | NA | AFE_1038 |
| csf2 | Type U | csf2 | NA | NA | (RAMP) | AFE_1039 |
| csf3 | Type U | csf3 | NA | NA | (RAMP) | AFE_1040 |
| csf4 | Type U | csf4 | NA | NA | NA | AFE_1037 |

N, amino; NA, not applicable; RAMP, repeat-associated mysterious protein.

*

Includes the names of all genes that have been shown to function within the CRISPR–Cas (clustered regularly interspaced short palindromic repeats–CRISPR-associated proteins) systems and/or are associated with CRISPR–cas loci in diverse genomes. Genes that are associated with CRISPR–cas loci in only one or a few closely related genomes are not included. Subsequent to their original publication (REF. 13), Haft et al. introduced a number of new types of CRISPR–Cas systems as well as gene names that are included in the TIGRFAMs database (REF. 50) but mostly fit into previously described gene and protein families.

The updated TIGRFAMs identifiers are given in Supplementary information S4 (table). The csx names are temporarily given to cas genes that cannot be confidently included in any of the large cas families but are currently not characterized in sufficient detail to rule out the possibility of such assignments in the future. Beginning with release 10.1 (ftp://ftp.jcvi.org/pub/data/TIGRFAMs/), the hidden Markov model (HMM)-based classifiers in TIGRFAMs assign polythetic names reflecting the nomenclature changes described here while retaining the narrower protein family granularities of the original nomenclature (REF. 13).

§

See REF. 13. Most of the families correspond to those proposed by Makarova et al. (REF. 14), with a few changes and additions.

See REF. 24.

All available structures are listed; see the Protein Data Bank (PDB).

#

Tentative predictions based on weak sequence similarity, sequence length and gene order in an operon.

**

See the clusters of orthologous groups of proteins (COGs) database.

‡‡

These are signature genes for these CRISPR–Cas system types and subtypes.

§§

Unclassified.
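
Because the table maps each proposed gene name to one or more earlier names, it can be useful to invert that mapping when reading older annotations. The following is a minimal sketch in Python, assuming only a handful of rows transcribed from the "Name from Haft et al." column; the dictionary and function names (`PROPOSED_TO_LEGACY`, `build_reverse_index`, `proposed_names`) are hypothetical and are not part of TIGRFAMs or any published tool.

```python
from collections import defaultdict

# Subset of table rows: proposed gene name -> names from Haft et al.
# (illustrative only; the full table contains many more rows)
PROPOSED_TO_LEGACY = {
    "cas5": ["cas5a", "cas5d", "cas5e", "cas5h", "cas5p", "cas5t", "cmx5"],
    "cas6e": ["cse3"],
    "cas6f": ["csy4"],
    "cas7": ["csa2", "csd2", "cse4", "csh2", "csp1", "cst2"],
    "cas9": ["csn1", "csx12"],
    "csc2": ["csc1", "csc2"],
    "csm3": ["csc2", "csm3"],
}


def build_reverse_index(table):
    """Invert the table: legacy name -> set of proposed names."""
    index = defaultdict(set)
    for proposed, legacy_names in table.items():
        for legacy in legacy_names:
            index[legacy].add(proposed)
    return index


LEGACY_TO_PROPOSED = build_reverse_index(PROPOSED_TO_LEGACY)


def proposed_names(legacy_name):
    """Return the proposed names for a legacy name, sorted; empty if unknown."""
    return sorted(LEGACY_TO_PROPOSED.get(legacy_name, set()))


if __name__ == "__main__":
    for name in ("csn1", "csy4", "csc2"):
        print(name, "->", proposed_names(name))
```

A legacy name can resolve to more than one proposed name (in this subset, `csc2` appears under both csc2 and csm3), which is why the reverse index stores sets rather than single values.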
