Table 4.
Name | Species | Genome size (bp) | Coverage (×) | Read N50 (bp) |
---|---|---|---|---|
PB-ce-40X | Caenorhabditis elegans | 104M | 45 | 16 572 |
ERS473430 | Citrobacter koseri | 4.9M | 106 | 7543 |
ERS544009 | Yersinia pseudotuberculosis | 4.7M | 147 | 9002 |
ERS554120 | Pseudomonas aeruginosa | 6.4M | 90 | 7106 |
ERS605484 | Vibrio vulnificus | 5.0M | 155 | 5091 |
ERS617393 | Acinetobacter baumannii | 4.0M | 237 | 7911 |
ERS646601 | Haemophilus influenzae | 1.9M | 258 | 4081 |
ERS659581 | Klebsiella sp. | 5.1M | 129 | 8031 |
ERS670327 | Shimwellia blattae | 4.2M | 155 | 6765 |
ERS685285 | Streptococcus sanguinis | 2.4M | 224 | 5791 |
ERS743109 | Salmonella enterica | 4.8M | 188 | 6051 |
PB-ecoli | Escherichia coli | 4.6M | 160 | 13 976 |
PBcR-PB-ec | Escherichia coli | 4.6M | 30 | 11 757 |
PBcR-ONT-ec | Escherichia coli | 4.6M | 29 | 9356 |
MAP-006-1 | Escherichia coli | 4.6M | 54 | 10 892 |
MAP-006-2 | Escherichia coli | 4.6M | 30 | 10 794 |
MAP-006-pcr-1 | Escherichia coli | 4.6M | 30 | 8080 |
MAP-006-pcr-2 | Escherichia coli | 4.6M | 60 | 8064 |
Columns give the evaluation dataset name, species, reference genome size, theoretical sequencing coverage and N50 read length. Datasets whose names start with ‘MAP’ are recent unpublished ONT data provided by the Loman lab (http://bit.ly/loman006). Names starting with ‘ERS’ are accession numbers of unpublished PacBio data from the NCTC project (http://bit.ly/nctc3k). PB-ecoli and PB-ce-40X are public PacBio datasets sequenced with the P6/C4 chemistry (http://bit.ly/pbpubdat; retrieved on 11/03/2015). PBcR-PB-ec is the PacBio sample data (P5/C3 chemistry) used in the tutorial of the PBcR pipeline; PBcR-ONT-ec is the ONT example originally used by Loman et al. (2015). ‘pls2fasta -trimByRegion’ was applied to the ERS* and PB-ecoli datasets because they do not provide read sequences in FASTQ format.
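As an illustration of the two summary statistics in the table, the following minimal sketch computes them from a list of read lengths. It assumes the common definition of N50 (the largest length L such that reads of length at least L contain at least half of all sequenced bases) and takes theoretical coverage to be total sequenced bases divided by reference genome size. The exact definitions used to produce the table are not stated here, and the function names `read_n50` and `theoretical_coverage` are hypothetical.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths.

    Assumed definition: the largest length L such that reads of
    length >= L together contain at least half of all bases.
    """
    if not lengths:
        raise ValueError("no read lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest read down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def theoretical_coverage(lengths, genome_size):
    """Total sequenced bases divided by the reference genome size (assumed definition)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(lengths) / genome_size


if __name__ == "__main__":
    # Toy read lengths, not taken from any dataset in the table.
    toy_lengths = [2, 3, 4, 5, 6]
    print(read_n50(toy_lengths))
    print(theoretical_coverage(toy_lengths, genome_size=10))
```

In practice the read lengths would be collected from the FASTA/FASTQ file of each dataset, and the genome size would be taken from the reference assembly of the species.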