Table 2.
Experimental data for human and mouse assemblies
Species | Library type | No. of libraries | DNA used, μg | Mean size, bp | Read length | Sequence coverage (All), × | Sequence coverage (PF), × | Sequence coverage (Aligned), × | Sequence coverage (Unique), × | Sequence coverage (Valid), × | Physical coverage, × |
Human | Fragment | 1 | 3 | 155 | 101 | 51.9 | 41.8 | 38.4 | 37.9 | 36.5 | 27.8 |
Human | Short jump | 2 | 20 | 2,536 | 101 | 45.9 | 40.7 | 33.7 | 31.7 | 19.7 | 249.4 |
Human | Fosmid jump | 2 | 20 | 35,295 | 76* | 5.3 | 4.0 | 3.0 | 0.4 | 0.3 | 49.5 |
Human | Total | 5 | 43 | | | 103.1 | 86.5 | 75.1 | 70.0 | 56.5 | 326.7 |
Mouse | Fragment | 1 | 3 | 168 | 101 | 58.6 | 53.1 | 49.6 | 46.6 | 45.3 | 37.6 |
Mouse | Short jump | 3 | 20 | 2,209 | 101 | 48.0 | 40.7 | 35.1 | 32.0 | 19.9 | 219.1 |
Mouse | Long jump | 5 | 50 | 7,532 | 26 | 13.5 | 9.3 | 9.2 | 5.5 | 2.9 | 408.3 |
Mouse | Fosmid jump | 1 | 30 | 38,453 | 76 | 1.4 | 1.1 | 1.1 | 0.1 | 0.1 | 23.1 |
Mouse | Total | 10 | 103 | | | 121.5 | 104.2 | 95.0 | 84.2 | 68.2 | 688.1 |
The data used as assembly input are shown. Tables S1 and S2 provide more detail. Library type: See Table 1. DNA used: Amount of DNA used as input to library construction. For each genome and each library type, a single aliquot was used. DNA source for human: Coriell Biorepository, NA12878. DNA source for mouse: Jackson Laboratory C57BL/6J (stock 000664). Mean size: Mean of the observed fragment size distribution. Read length: Number of bases sequenced. The exception is the long jump libraries prepared with EcoP15I digestion, which yield 26 bases of genomic information; these inserts were sequenced to 36 bases and then trimmed to 26 bases. Sequence coverage: All reads were used in the assembly, but we describe their properties here via a series of nested categories. All: Total number of bases in reads divided by genome size, which is assumed to be the reference size of 3.10 Gb for human and 2.73 Gb for mouse. PF: Coverage by purity-filtered (PF) reads. Aligned: Coverage by aligned PF reads. Unique: Coverage by aligned PF reads, exclusive of duplicates, which were identified by concurrence of the start and stop points of pairs on the reference. Valid: Coverage by unique pairs for which the fragment length was within 5 SDs of the mean. Physical coverage: Total coverage by valid pairs and the bases between them.
*Reads from one library had length 76, and those from the other had length 101.
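The coverage definitions in the legend can be illustrated with a short Python sketch. Sequence coverage follows the stated definition (bases in reads divided by genome size). The physical-coverage function is only an approximation that we introduce here for illustration, not the authors' calculation: it assumes every valid pair contributes two full-length reads and spans a fragment of exactly the mean size, so that physical coverage is roughly valid coverage × mean size / (2 × read length). The function names and the 160.89 Gb input (back-calculated from 51.9 × 3.10 Gb) are hypothetical.

```python
def sequence_coverage(total_bases: float, genome_size: float) -> float:
    """Sequence coverage as defined in the legend: bases in reads / genome size."""
    return total_bases / genome_size


def approx_physical_coverage(valid_cov: float, mean_size: float,
                             read_length: float) -> float:
    """Rough physical coverage from valid sequence coverage.

    ASSUMPTION (not from the source): each valid pair contributes
    2 * read_length sequenced bases and spans mean_size bases of genome,
    so the two coverages differ by the factor mean_size / (2 * read_length).
    """
    return valid_cov * mean_size / (2 * read_length)


if __name__ == "__main__":
    # Human genome size assumed in the legend: 3.10 Gb.
    # 160.89e9 bases is a hypothetical input, back-calculated from 51.9x.
    print(round(sequence_coverage(160.89e9, 3.10e9), 1))

    # Human fragment library: valid 36.5x, mean size 155 bp, 101-base reads.
    print(round(approx_physical_coverage(36.5, 155, 101), 1))

    # Human short jump libraries: valid 19.7x, mean size 2,536 bp, 101-base reads.
    print(round(approx_physical_coverage(19.7, 2536, 101), 1))
```

Under these assumptions the sketch gives about 28.0× for the human fragment library and about 247.3× for the human short jump libraries, close to the reported 27.8× and 249.4×. The small differences are expected, because the table values are rounded and real fragment sizes vary around the mean.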