Table 1. Summary of assembly and annotation statistics for the Illumina assembly of A. rabiei isolate ArD2 (Verma et al. 2016) and the PacBio SMRT assembly of isolate ArME14.
Assembly statistics | ArD2 Illumina (13) | ArME14 PacBio SMRT |
---|---|---|
Genome size (bp) | 34,658,250 | 40,927,385 |
Total sequenced bases | 100 Gb | ∼6.8 Gb |
Coverage | 178x | 166x (928,353 reads) |
Number of scaffolds/contigs | 338 a | 33 b |
Largest scaffold/contig size (bp) | 1,160,210 a | 3,373,759 b |
L50 | 64 a | 9 b |
N50 (bp) | 154,808 a | 1,812,190 b |
GC (%) | 51.6 | 49.2 |
% Repetitive sequence | 9.9 | 12.6 |
Complete chromosomes | — | 12 |
Annotation statistics | | |
Number of protein coding genes | 10,596 | 11,257 |
Predicted secreted proteins | 758 c (1,111 d) | 1,145 c |
Predicted effectors | 328 c (36 d) | 39 c |
Predicted secondary metabolite clusters | 26 e | 26 |
Predicted no. of CAZymes | 1,727 c (441 f) | 451 c,f |
a Scaffolds for ArD2 Illumina assembly (GCA_001630375.1).
b Contigs for ArME14 PacBio SMRT assembly.
c Differences in numbers likely due largely to different selection criteria.
d Secretome and effector predictions for ArD2 assembly using the same methods applied to ArME14.
e Unknown prediction method for secondary metabolite clusters.
f CAZyme prediction using dbCAN2 meta server in this study.
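
For readers unfamiliar with the contiguity metrics in Table 1, the following is a minimal sketch of how N50 and L50 are conventionally defined: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the assembly size, while L50 is the number of sequences needed to get there. The function name `n50_l50` and the toy lengths are hypothetical and are not taken from either assembly or from the tools used in this study.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths in bp."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Sort from longest to shortest sequence.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running total to at least
        # half of the assembly size defines N50; its rank is L50.
        if running >= half_total:
            return length, count


# Hypothetical toy lengths for illustration only (not real assembly data).
toy_lengths = [50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50} bp, L50 = {l50}")
```

Under this definition, a more contiguous assembly has a larger N50 and a smaller L50, which is the pattern Table 1 reports for ArME14 relative to ArD2.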