2017 Jun 21;7(8):2763–2778. doi: 10.1534/g3.117.043893

Table 3. Data dependencies required to successfully run each component of the McClintock pipeline.

	ngs_te_mapper	RelocaTE	TEMP	RetroSeq	PoPoolationTE	TE-locate
Reference genome (fasta)	✓	✓	✓	✓	✓	✓
Canonical TE sequences (fasta)	✓	✓^a	✓	✓^b	✓
Annotation of reference TEs (GFF)					✓	✓
Annotation of reference TEs (BED)			✓	✓^c
Annotation of reference TEs (custom format)		✓
Unaligned reads (single-end fastq)	✓	✓
Unaligned reads (paired-end fastq)					✓
Aligned reads (BAM)			✓	✓
Aligned reads (lexically sorted SAM)						✓
TE hierarchy (custom format)			✓		✓

^a Must include an entry in the format “TSD=…” for each TE in the file, on the same line as the header, where “…” is the TSD sequence if known, or a string of periods equal in number to the TSD length if the TSD sequence is unknown. If neither the length nor the sequence of the TSD is known, “TSD=UNK” can be supplied.
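The three cases in this footnote can be sketched as a small helper. This is only an illustration: the function names, the example TE name, and the single space between the TE name and the “TSD=” entry are assumptions, not something the table specifies.

```python
from typing import Optional


def tsd_field(sequence: Optional[str] = None, length: Optional[int] = None) -> str:
    """Build the "TSD=" entry for one TE (hypothetical helper)."""
    if sequence:
        # TSD sequence is known: write it out directly.
        return f"TSD={sequence}"
    if length is not None and length > 0:
        # Only the TSD length is known: one period per base.
        return "TSD=" + "." * length
    # Neither the sequence nor the length is known.
    return "TSD=UNK"


def fasta_header(te_name: str, sequence: Optional[str] = None,
                 length: Optional[int] = None) -> str:
    """Place the TSD entry on the same line as the fasta header.

    Assumption: the name and the TSD entry are separated by one space.
    """
    return f">{te_name} {tsd_field(sequence, length)}"


if __name__ == "__main__":
    print(fasta_header("TE_A", sequence="ATGC"))
    print(fasta_header("TE_B", length=5))
    print(fasta_header("TE_C"))
```

Keeping the entry in one function makes it easy to adjust if the tool turns out to expect a different separator or field order.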

^b Must be formatted as one fasta file per TE family and a file of files listing their locations.

^c Must be one BED file for each entry in the reference TE annotation and a file of files listing their locations.
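The "one file per family plus a file of files" layout described in the last two footnotes could be prepared roughly as follows. This is a minimal sketch under stated assumptions: each fasta record is treated as one TE family, output files are named after the first word of the header, and the file of files holds one absolute path per line. The function names are hypothetical, and the exact layout a given tool expects (for example, an additional family-name column) should be checked against its documentation.

```python
from pathlib import Path
from typing import Dict, List


def split_fasta_by_family(fasta_text: str, out_dir: Path) -> List[Path]:
    """Write each fasta record to its own file and return the paths.

    Assumption: one record per TE family; records that share a name
    are overwritten by the last one seen.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    records: Dict[str, List[str]] = {}
    current = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Use the first word of the header as the family name.
            current = (line[1:].split() or ["unnamed"])[0]
            records[current] = [line]
        elif current is not None and line.strip():
            records[current].append(line)

    paths = []
    for name, lines in records.items():
        path = out_dir / f"{name}.fasta"
        path.write_text("\n".join(lines) + "\n")
        paths.append(path)
    return paths


def write_file_of_files(paths: List[Path], fof_path: Path) -> None:
    """List the location of every per-family file, one path per line."""
    fof_path.write_text("".join(f"{p.resolve()}\n" for p in paths))
```

The same `write_file_of_files` helper would apply to a set of per-entry BED files, since it only records where existing files are located.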