Table 1.
| Data type | Format | Implementation |
|---|---|---|
| Feature annotations (e.g. genes, transcripts, exons, origins of replication) | BED, extended BED* | Plastid |
| | BigBed | Plastid + kentUtils [46] |
| | GTF2* | Plastid |
| | GFF3* | Plastid |
| | PSL* | Plastid |
| Read alignments | bowtie | Plastid |
| | BAM | Plastid + Pysam [27] |
| Reduced count data | bedGraph | Plastid |
| | BigWig | Plastid + kentUtils [46] |
| | wiggle (fixedStep) | Plastid |
| | wiggle (variableStep) | Plastid |
| Sequence | FASTA | via Biopython [20] |
| | twobit | via twobitreader [21] |
Many file formats exist for each category of genomics data. Plastid includes a reader for each format that converts the data to a standard representation for its type, so that the meaning of the data is separated from its format on disk. *Tabix compression for these formats is supported via Pysam [27].
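
The idea of separating meaning from on-disk format can be illustrated with a minimal, standard-library sketch. This is not Plastid's API: the `Feature` class and the `read_bed` and `read_gtf` functions are hypothetical names introduced only for illustration. It assumes six-column BED input and GTF2 lines carrying a `gene_id` attribute, and relies on the fact that BED coordinates are 0-based and half-open while GTF2 coordinates are 1-based and fully closed.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Feature:
    """Hypothetical format-independent feature (0-based, half-open)."""
    chrom: str
    start: int
    end: int
    strand: str
    name: str


def read_bed(lines: Iterable[str]) -> Iterator[Feature]:
    """Parse BED lines; assumes at least six tab-separated columns."""
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name, _score, strand = line.rstrip("\n").split("\t")[:6]
        # BED is already 0-based, half-open: no coordinate conversion needed
        yield Feature(chrom, int(start), int(end), strand, name)


def read_gtf(lines: Iterable[str]) -> Iterator[Feature]:
    """Parse GTF2 lines; the feature name is taken from `gene_id`."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        name = ""
        for item in fields[8].split(";"):
            item = item.strip()
            if item.startswith("gene_id"):
                name = item.split(None, 1)[1].strip('"')
        # GTF2 is 1-based, fully closed: shift the start to 0-based
        yield Feature(chrom, start - 1, end, strand, name)


# The same exon, written in two different formats
bed_lines = ["chrI\t99\t200\tYFG1\t0\t+\n"]
gtf_lines = ['chrI\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "YFG1"; transcript_id "YFG1_t";\n']

bed_features = list(read_bed(bed_lines))
gtf_features = list(read_gtf(gtf_lines))
```

Under these assumptions both readers yield the same `Feature` for the same exon, so downstream code never needs to know which file format the data came from.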