Table 1:
Gap-closing accuracy statistics and computational consumption for TGS-GapCloser
Accuracy is reported at two levels: long-read selection and single base.

Input data | No. of closed gaps | No. of closed gaps in theory | PPV (%) | Sensitivity (%) | Runtime (hours) | Peak memory usage (GB) | No. of filled bases (bp) | No. of filled bases in theory (bp) | QV of input raw long reads (Phred) | QV of input scaftigs (Phred) | QV of filled long reads with correction (Phred) | QV of output scaftigs (Phred) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
MaSuRCA + SLR-superscaffolder + TGS-GapCloser (ONT) | 75,629 | 74,353 | 96.6 | 96.3 | 259 | 50 | 335,541,557 | 353,352,038 | 7.63 | 40.51 | 23.24 | 36.06 |
MaSuRCA + SLR-superscaffolder + TGS-GapCloser (PacBio) | 74,321 | 74,353 | 98.2 | 89.8 | 13 | 33 | 198,327,815 | 353,352,038 | 26.99 | 40.51 | 35.52 | 37.64 |
Mercedes + SLR-superscaffolder + TGS-GapCloser (ONT) | 58,938 | 61,267 | 97.7 | 93.4 | 145 | 51 | 352,316,717 | 497,208,670 | 7.63 | 48.09 | 23.23 | 40.19 |
Mercedes + SLR-superscaffolder + TGS-GapCloser (PacBio) | 52,116 | 61,267 | 98.4 | 75.6 | 11 | 32 | 146,148,151 | 497,208,670 | 26.99 | 48.09 | 36.25 | 42.29 |
Supernova + TGS-GapCloser (ONT) | 22,563 | 24,760 | 62.0 | 51.2 | 163 | 74 | 49,669,581 | 38,276,270 | 7.63 | 48.72 | 23.15 | 46.11 |
Supernova + TGS-GapCloser (PacBio) | 26,919 | 24,760 | 76.1 | 61.2 | 20 | 38 | 22,178,115 | 38,276,270 | 26.99 | 48.72 | 34.82 | 46.48 |
All datasets were run with 42 threads. Peak memory usage does not include memory consumed by Pilon or Racon. The higher speed of runs using the PacBio HiFi dataset mainly originates from the use of Racon to correct fragments with long reads. QUAST accepts runs of fewer than 10 consecutive N's within a scaftig. PPV: positive predictive value; QV: quality value.
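
For readers less familiar with the metrics in Table 1, the following minimal Python sketch shows the standard definitions that the column names suggest. It assumes the usual Phred convention, QV = -10 log10(p), and the textbook forms PPV = TP / (TP + FP) and sensitivity = TP / (TP + FN). How true and false positives are counted for closed gaps is not specified in the table, so the function names and the counts in the usage note are hypothetical and serve only as illustration.

```python
import math


def phred_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_phred(p: float) -> float:
    """Convert a per-base error probability (0 < p <= 1) to a Phred-scaled QV."""
    return -10 * math.log10(p)


def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: fraction of reported calls that are correct."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity: fraction of expected calls that were recovered."""
    return true_pos / (true_pos + false_neg)


# QVs copied from the MaSuRCA + SLR-superscaffolder + TGS-GapCloser (ONT) row.
table_qvs = [
    ("input raw long reads", 7.63),
    ("input scaftigs", 40.51),
    ("filled long reads with correction", 23.24),
    ("output scaftigs", 36.06),
]

for label, qv in table_qvs:
    print(f"{label}: QV {qv} -> error rate {phred_to_error_rate(qv):.2e}")
```

Under this convention, QV 20 corresponds to a 1% per-base error rate and QV 30 to 0.1%, so the raw ONT reads at QV 7.63 carry an error rate of roughly 17%. As a purely illustrative call with made-up counts, `ppv(90, 10)` evaluates to 0.9 and `sensitivity(90, 30)` to 0.75.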