2014 Aug 29;2(9):apps.1400042. doi: 10.3732/apps.1400042

Table 1.

Hyb-Seq target enrichment probe design and bioinformatics pipeline. A script combining and detailing the steps of the probe design process, Building_exon_probes.sh, is provided in the supplementary materials (Appendix S1).

Steps	Description	Primary program or custom script
Probe design
Match	Find genome and transcriptome sequences with 99% identity.	BLAT^a
Filter	Retain single hits of substantial length.	Part of Building_exon_probes.sh^b
Cluster	Remove isoforms and loci sharing >90% identity.	CD-HIT-EST^c, grab_singleton_clusters.py^b
Filter	Retain loci with long exons summing to desired length.	blat_block_analyzer.py^b
Cluster	Remove exons sharing >90% identity.	CD-HIT-EST^c, grab_singleton_clusters.py^b
Short read processing and data analysis
Read processing	Adapter trimming, quality filtering	Trimmomatic^d
Exon assembly	Reconstruct a sequence for each sample, for each exon.	YASRA^e, Alignreads^f
Identify assembled contigs	If contig identity is unknown, identify which target exon(s) it corresponds to.	BLAT^a
Sequence alignment I: Collate exons	Cluster orthologous exons across samples.	assembled_exons_to_fasta.py^b
Sequence alignment II: Perform alignment	Align homologous bases within each exon.	MAFFT^g
Concatenate exons	For each locus, concatenate the aligned exons.	catfasta2phyml.pl^h
Gene tree construction	For each locus, estimate the maximum likelihood gene tree.	RAxML^i
Species tree construction	Estimate the species tree from independent gene trees in a coalescent framework.	MP-EST^j

^a Kent (2002).

^b New scripts written for this protocol, an example data set, and any future updates are available at https://github.com/listonlab/.

^c Li and Godzik (2006).

^d Bolger et al. (2014).

^e Ratan (2009).

^f Straub et al. (2011).

^g Katoh and Toh (2008).

^h Nylander (2011).

^i Stamatakis (2006).

^j Liu et al. (2010).
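
The "Concatenate exons" step in Table 1 can be illustrated with a minimal Python sketch. This is not the catfasta2phyml.pl implementation; it only shows the idea of joining the aligned exons of one locus, sample by sample. The function name `concatenate_exons`, the dictionary-based input, the sample names, and the choice to pad samples missing from an exon with gap characters are all assumptions made for illustration.

```python
def concatenate_exons(exon_alignments):
    """Join aligned exons of one locus into a single alignment.

    exon_alignments: list of dicts, one per exon in locus order,
    each mapping sample name -> aligned sequence (hypothetical input format).
    """
    # Collect sample names in order of first appearance across all exons.
    samples = []
    for alignment in exon_alignments:
        for name in alignment:
            if name not in samples:
                samples.append(name)

    parts = {name: [] for name in samples}
    for alignment in exon_alignments:
        # All sequences in one exon alignment must share the same width.
        widths = {len(seq) for seq in alignment.values()}
        if len(widths) != 1:
            raise ValueError("sequences within an exon alignment must have equal length")
        width = widths.pop()
        for name in samples:
            # Assumption: a sample absent from this exon is padded with gaps.
            parts[name].append(alignment.get(name, "-" * width))

    return {name: "".join(chunks) for name, chunks in parts.items()}


# Toy example with two aligned exons and three samples.
exon1 = {"sampleA": "ATG-CC", "sampleB": "ATGACC"}
exon2 = {"sampleA": "GGT", "sampleC": "GGA"}
locus = concatenate_exons([exon1, exon2])
for name, seq in locus.items():
    print(f">{name}\n{seq}")
```

Padding with gaps keeps every sample at the same total length, which downstream tree-building programs generally require of an alignment. Whether missing exons should instead cause a sample to be dropped is a design decision this sketch does not address.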