Table 1. Alignment statistics for the datasets used in this study.
Dataset | Sample | Genome | Assay | Cell Line | Unireads | Analyzable Multireads | Unanalyzable Multireads | % Increase
---|---|---|---|---|---|---|---|---
– | Simulated, 50bp | hg38 | Simulation | – | 245,079,644 | 34,136,124 | 7,661,326 | 13.93%
– | Simulated, -k 101 | hg38 | Simulation | – | 244,391,815 | 35,520,969* | 6,973,053* | 14.53%
– | Simulated, 100bp | hg38 | Simulation | – | 123,730,306 | 16,769,189 | 2,802,056 | 13.55%
AR7 | Input Rep. 1 | mm10* | MNAse-seq | mESC E14 | 311,090,692 | 85,018,787 | 15,184,872 | 27.33%
AR7 | H3K4me3 Rep. 1 | mm10* | ChIP-seq | mESC E14 | 119,014,494 | 19,662,529 | 5,603,383 | 16.52%
AR7 | Input Rep. 2 | mm10* | ChIP-seq | mESC E14 | 304,127,899 | 83,629,528 | 17,160,012 | 27.50%
AR7 | H3K4me3 Rep. 2 | mm10* | ChIP-seq | mESC E14 | 91,518,104 | 14,549,072 | 4,657,032 | 15.90%
AR8 | Input | dm3† | MNAse-seq | S2 | 18,678,956 | 7,117,520 | 977,776 | 38.10%
AR8 | H3K27me3 | dm3† | ChIP-seq | S2 | 8,855,114 | 3,249,005 | 389,227 | 36.69%
AR9 | Input | mm10† | MNAse-seq | mESC E14 | 488,503,092 | 131,960,514 | 26,577,525 | 27.01%
AR9 | H3K4me3 | mm10† | ChIP-seq | mESC E14 | 169,335,369 | 32,089,449 | 7,918,756 | 18.95%
AR9 | H3K9me3 | mm10† | ChIP-seq | mESC E14 | 136,008,760 | 73,118,061 | 13,012,319 | 53.76%
AR9 | H3K27me3 | mm10† | ChIP-seq | mESC E14 | 155,322,021 | 43,508,387 | 9,267,806 | 28.01%
AR16 | Input | hg38‡ | MNAse-seq | K562 | 285,996,344 | 56,595,547 | 12,902,707 | 19.79%
AR16 | H3K4me1 | hg38‡ | ChIP-seq | K562 | 92,422,802 | 16,475,108 | 2,434,216 | 17.83%
AR16 | H3K4me2 | hg38‡ | ChIP-seq | K562 | 70,987,452 | 12,931,282 | 2,558,979 | 18.22%
AR16 | H3K4me3 | hg38‡ | ChIP-seq | K562 | 40,483,145 | 5,488,996 | 803,892 | 13.56%
AR17 | Input | hg38‡ | MNAse-seq | K562 | 256,373,920 | 48,634,887 | 11,216,500 | 18.97%
AR17 | H3K9me3 | hg38‡ | ChIP-seq | K562 | 193,011,406 | 40,618,196 | 10,337,413 | 21.04%
AR17 | H3K27me3 | hg38‡ | ChIP-seq | K562 | 173,915,939 | 32,770,085 | 7,107,199 | 18.84%
ENCODE | Snyder Rep. 1 | hg38 | ATAC-seq | K562 | 32,995,935 | 6,484,894 | 299,834 | 19.65%
ENCODE | Snyder Rep. 2 | hg38 | ATAC-seq | K562 | 24,414,870 | 4,210,386 | 149,154 | 17.25%
ENCODE | Gingeras Rep. 1 | hg38§ | RNA-seq | K562 | 60,184,580 | 20,651,064 | 29,231 | 34.31%
ENCODE | Gingeras Rep. 2 | hg38§ | RNA-seq | K562 | 63,238,387 | 13,087,755 | 14,070 | 20.70%
For all datasets, Unireads refers to the number of reads with one alignment.
For all datasets except “Simulated, -k 101”, Analyzable Multireads refers to reads with 2–50 alignments; Unanalyzable Multireads refers to reads with 51 reported alignments, the limit on reported alignments per read.
For the “Simulated, -k 101” dataset, Analyzable Multireads refers to reads with 2–100 alignments, and Unanalyzable Multireads refers to reads with 101 reported alignments.
% Increase: Increase in the number of analyzable reads with SmartMap analysis, computed as the number of Analyzable Multireads as a percentage of the number of Unireads.
Genome includes ICeChIP barcodes: * Series 1; † Series 2; ‡ Series 3.
§ Genome includes ENCODE ERCC standards.
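
The read categories and the % Increase column defined in the notes above can be illustrated with a minimal Python sketch. The function names `classify_read` and `percent_increase`, the parameter `k`, and the "unaligned" label are hypothetical and chosen for illustration; they are not taken from the SmartMap software. The sketch assumes only the definitions stated in the table notes.

```python
def classify_read(n_alignments: int, k: int = 51) -> str:
    """Classify a read by its number of reported alignments.

    k is the per-read reporting limit: 51 for most datasets in Table 1,
    101 for the "Simulated, -k 101" dataset. A read that hits the limit
    is treated as unanalyzable, following the table notes.
    """
    if n_alignments < 1:
        return "unaligned"      # hypothetical label, not a Table 1 category
    if n_alignments == 1:
        return "uniread"
    if n_alignments < k:
        return "analyzable"     # 2 to k-1 alignments
    return "unanalyzable"       # k reported alignments (the limit)


def percent_increase(unireads: int, analyzable: int) -> float:
    """Analyzable Multireads expressed as a percentage of Unireads."""
    return 100.0 * analyzable / unireads


# Counts copied from the "Simulated, 50bp" row of Table 1.
print(f"{percent_increase(245_079_644, 34_136_124):.2f}%")  # 13.93%
```

Passing `k=101` reproduces the category boundaries used for the “Simulated, -k 101” dataset, where reads with 2–100 alignments count as analyzable.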