Author manuscript; available in PMC: 2016 Jan 21.
Published in final edited form as: Immunity. 2015 Jan 20;42(1):18–27. doi: 10.1016/j.immuni.2015.01.004

Table 1.

Number and frequency of N20NGG, unique N20NGG, unique N13NGG and unique N12NGG sequences in the mouse genome.

Sequence         Number of sites    Frequency in mouse genome (number of sites/number of nucleotides)
N20NGG           274,072,490        1/10
Unique N20NGG    204,035,213        1/13
Unique N13NGG    11,369,632         1/240
Unique N12NGG    5,274,838          1/517

The mouse genome is 2,725,765,481 nucleotides long. To generate this table, the mouse genome (mm9, NCBI Build 37) was downloaded in FASTA format (chromFa.tar.gz) from http://hgdownload.soe.ucsc.edu/goldenPath/mm9/bigZips/. All possible 20-mers followed by a PAM sequence (NGG) were extracted with the fuzznuc tool from the EMBOSS Suite (Rice et al., 2000). A Python script was used to convert the fuzznuc output to BED files and to extract all 20-, 13-, or 12-mers to separate files, with one site per line. A pipeline of the UNIX commands “sort” and “uniq” was then used to return the unique lines.
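
The extraction, de-duplication, and frequency steps described in this note can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' script. The function names are hypothetical, a regular expression stands in for fuzznuc, and a Python set stands in for the "sort" and "uniq" pipeline (each distinct sequence is kept once). The 13- and 12-mers are assumed to be the PAM-proximal nucleotides of each 20-mer. Only the strand passed in is scanned, and sites containing N are skipped. The note does not state how the original analysis handled either point.

```python
import re

# Zero-width lookahead so that overlapping N20NGG sites are all reported.
# Assumption: protospacer and PAM contain only A, C, G, T (no N).
SITE = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")


def find_sites(seq):
    """Yield (start, 20-mer) for every N20NGG site on the given strand."""
    for match in SITE.finditer(seq.upper()):
        yield match.start(), match.group(1)


def distinct_kmers(protospacers, k):
    """Return the distinct PAM-proximal k-mers (last k nt of each 20-mer).

    Assumption: 'unique' means distinct, as `sort | uniq` would return.
    """
    return {p[-k:] for p in protospacers}


def frequency(n_sites, genome_length):
    """Express site density as '1/x', where x is nucleotides per site."""
    return f"1/{round(genome_length / n_sites)}"


# Toy sequence with two sites that differ in their 20-mers and 13-mers
# but share the same PAM-proximal 12-mer.
toy = ("ACGT" * 5 + "AGG") + ("TTTTTTTG" + "ACGT" * 3 + "CGG")
sites = list(find_sites(toy))
protospacers = [p for _, p in sites]

print(len(sites))                             # total N20NGG sites
print(len(distinct_kmers(protospacers, 20)))  # distinct 20-mers
print(len(distinct_kmers(protospacers, 13)))  # distinct 13-mers
print(len(distinct_kmers(protospacers, 12)))  # distinct 12-mers

# Frequencies recomputed from the counts and genome length given in Table 1.
MM9_LENGTH = 2_725_765_481
print(frequency(274_072_490, MM9_LENGTH))
print(frequency(5_274_838, MM9_LENGTH))
```

For a whole genome, holding every sequence in a set may use a lot of memory. Writing one sequence per line and de-duplicating on disk, as the note describes, avoids this.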