2021 Apr 19;17(4):e1008926. doi: 10.1371/journal.pcbi.1008926

Table 1. Alignment statistics for the datasets used in this study.

| | Sample | Genome | Assay | Cell Line | Unireads | Analyzable Multireads | Unanalyzable Multireads | % Increase |
|---|---|---|---|---|---|---|---|---|
| | Simulated, 50bp | hg38 | Simulation | | 245,079,644 | 34,136,124 | 7,661,326 | 13.93% |
| | Simulated, -k 101 | hg38 | Simulation | | 244,391,815 | 35,520,969* | 6,973,053* | 14.53% |
| | Simulated, 100bp | hg38 | Simulation | | 123,730,306 | 16,769,189 | 2,802,056 | 13.55% |
| AR7 | Input Rep. 1 | mm10* | MNAse-seq | mESC E14 | 311,090,692 | 85,018,787 | 15,184,872 | 27.33% |
| | H3K4me3 Rep. 1 | mm10* | ChIP-seq | mESC E14 | 119,014,494 | 19,662,529 | 5,603,383 | 16.52% |
| | Input Rep. 2 | mm10* | ChIP-seq | mESC E14 | 304,127,899 | 83,629,528 | 17,160,012 | 27.50% |
| | H3K4me3 Rep. 2 | mm10* | ChIP-seq | mESC E14 | 91,518,104 | 14,549,072 | 4,657,032 | 15.90% |
| AR8 | Input | dm3 | MNAse-seq | S2 | 18,678,956 | 7,117,520 | 977,776 | 38.10% |
| | H3K27me3 | dm3 | ChIP-seq | S2 | 8,855,114 | 3,249,005 | 389,227 | 36.69% |
| AR9 | Input | mm10 | MNAse-seq | mESC E14 | 488,503,092 | 131,960,514 | 26,577,525 | 27.01% |
| | H3K4me3 | mm10 | ChIP-seq | mESC E14 | 169,335,369 | 32,089,449 | 7,918,756 | 18.95% |
| | H3K9me3 | mm10 | ChIP-seq | mESC E14 | 136,008,760 | 73,118,061 | 13,012,319 | 53.76% |
| | H3K27me3 | mm10 | ChIP-seq | mESC E14 | 155,322,021 | 43,508,387 | 9,267,806 | 28.01% |
| AR16 | Input | hg38 | MNAse-seq | K562 | 285,996,344 | 56,595,547 | 12,902,707 | 19.79% |
| | H3K4me1 | hg38 | ChIP-seq | K562 | 92,422,802 | 16,475,108 | 2,434,216 | 17.83% |
| | H3K4me2 | hg38 | ChIP-seq | K562 | 70,987,452 | 12,931,282 | 2,558,979 | 18.22% |
| | H3K4me3 | hg38 | ChIP-seq | K562 | 40,483,145 | 5,488,996 | 803,892 | 13.56% |
| AR17 | Input | hg38 | MNAse-seq | K562 | 256,373,920 | 48,634,887 | 11,216,500 | 18.97% |
| | H3K9me3 | hg38 | ChIP-seq | K562 | 193,011,406 | 40,618,196 | 10,337,413 | 21.04% |
| | H3K27me3 | hg38 | ChIP-seq | K562 | 173,915,939 | 32,770,085 | 7,107,199 | 18.84% |
| ENCODE | Snyder Rep. 1 | hg38 | ATAC-seq | K562 | 32,995,935 | 6,484,894 | 299,834 | 19.65% |
| | Snyder Rep. 2 | hg38 | ATAC-seq | K562 | 24,414,870 | 4,210,386 | 149,154 | 17.25% |
| | Gingeras Rep. 1 | hg38§ | RNA-seq | K562 | 60,184,580 | 20,651,064 | 29,231 | 34.31% |
| | Gingeras Rep. 2 | hg38§ | RNA-seq | K562 | 63,238,387 | 13,087,755 | 14,070 | 20.70% |

For all datasets, Unireads refers to the number of reads with one alignment.

For all except the “Simulated, -k 101” dataset, Analyzable Multireads refers to reads with 2–50 alignments, and Unanalyzable Multireads refers to reads with 51 reported alignments, the limit for reported alignments per read.

For the “Simulated, -k 101” dataset, Analyzable Multireads refers to reads with 2–100 alignments, and Unanalyzable Multireads refers to reads with 101 reported alignments.
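The three read categories defined above depend only on the number of reported alignments and the reporting limit. A minimal sketch of that rule follows; the function name `classify_read` and the parameter `report_limit` are hypothetical and used here for illustration only, not taken from the study's software.

```python
def classify_read(n_alignments: int, report_limit: int = 51) -> str:
    """Classify a read by its number of reported alignments.

    report_limit is the maximum number of alignments reported per read
    (51 for most datasets in the table, 101 for "Simulated, -k 101").
    Names and structure are illustrative assumptions.
    """
    if n_alignments < 1:
        raise ValueError("an aligned read must have at least one alignment")
    if n_alignments == 1:
        return "uniread"
    if n_alignments < report_limit:
        # 2 to (report_limit - 1) alignments: all alignments were reported
        return "analyzable multiread"
    # The read hit the reporting limit, so it may have further unreported alignments
    return "unanalyzable multiread"
```

With the default limit, `classify_read(50)` falls in the analyzable category and `classify_read(51)` in the unanalyzable one; passing `report_limit=101` shifts that boundary to 100 and 101.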

% Increase: Increase in the number of analyzable reads with SmartMap analysis, computed as the number of Analyzable Multireads as a percentage of the number of Unireads.
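As a worked check of this definition, the short sketch below recomputes the % Increase column from the Unireads and Analyzable Multireads counts. The function name `percent_increase` is a hypothetical helper introduced for illustration; the input values are taken from the “Simulated, 50bp” row of the table.

```python
def percent_increase(unireads: int, analyzable_multireads: int) -> float:
    """Analyzable Multireads expressed as a percentage of Unireads."""
    if unireads <= 0:
        raise ValueError("unireads must be positive")
    return 100.0 * analyzable_multireads / unireads


# "Simulated, 50bp" row: 245,079,644 unireads, 34,136,124 analyzable multireads
print(f"{percent_increase(245_079_644, 34_136_124):.2f}%")  # prints 13.93%
```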

Genome includes ICeChIP barcodes:

* Series 1.

Series 2.

Series 3.

§ Genome includes ENCODE ERCC standards.