Table 1.
Sequence | Number of sites | Frequency in mouse genome (number of sites/number of nucleotides) |
---|---|---|
N20NGG | 274,072,490 | 1/10 |
Unique N20NGG | 204,035,213 | 1/13 |
Unique N13NGG | 11,369,632 | 1/240 |
Unique N12NGG | 5,274,838 | 1/517 |
The mouse genome length is 2,725,765,481 nucleotides. To generate this table, the mouse genome (mm9, NCBI Build 37) was downloaded in FASTA format (chromFa.tar.gz) from http://hgdownload.soe.ucsc.edu/goldenPath/mm9/bigZips/. All possible 20-mers followed by a PAM sequence (NGG) were extracted with the fuzznuc tool from the EMBOSS Suite (Rice et al., 2000). A Python script was used to convert the fuzznuc output to BED files and to extract all 20-, 13-, or 12-mers to separate files, with one site per line. A pipeline of the UNIX commands “sort” and “uniq” was then used to return unique lines.
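The counting procedure can be illustrated with a minimal Python sketch. It is not the script used to generate the table: a regular expression stands in for fuzznuc, an in-memory set stands in for the “sort” and “uniq” pipeline, and all function names are hypothetical. Three points are assumptions not stated above: that both strands are searched, that the 13- and 12-mers are the bases immediately adjacent to the PAM, and that "unique" means distinct sequences (as plain `sort | uniq` returns) rather than sequences occurring exactly once.

```python
import re

# Assumption: overlapping sites are all counted (hence the lookahead).
# Only unambiguous bases are matched, so stretches of N are skipped.
SITE = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MM9_LENGTH = 2_725_765_481  # genome length given in the table note


def revcomp(seq):
    """Reverse complement; characters other than A/C/G/T are left as-is."""
    return seq.translate(COMPLEMENT)[::-1]


def protospacers(seq, both_strands=True):
    """Yield every 20-mer that is immediately followed by NGG."""
    seq = seq.upper()  # UCSC FASTA files use lowercase for masked repeats
    strands = [seq, revcomp(seq)] if both_strands else [seq]
    for strand in strands:
        for match in SITE.finditer(strand):
            yield match.group(1)


def count_sites(seq, k=20, both_strands=True):
    """Return (total sites, distinct PAM-proximal k-mers)."""
    sites = list(protospacers(seq, both_strands))
    distinct = {site[-k:] for site in sites}  # assumption: PAM-proximal bases
    return len(sites), len(distinct)


def frequency_denominator(n_sites, genome_length=MM9_LENGTH):
    """Express the frequency as 1/x, with x rounded to the nearest integer."""
    return round(genome_length / n_sites)
```

Applied to the site counts in the table, `frequency_denominator` gives 10, 13, 240, and 517, matching the reported frequencies. For a whole genome, holding every site in memory may be impractical, which is one reason to write one site per line and deduplicate on disk as described above.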