2014 Aug 29;2(9):apps.1400042. doi: 10.3732/apps.1400042

Table 1.

Hyb-Seq target enrichment probe design and bioinformatics pipeline. A script combining and detailing the steps of the probe design process, Building_exon_probes.sh, is provided in the supplementary materials (Appendix S1).

Steps | Description | Primary program or custom script
Probe design
    Match | Find genome and transcriptome sequences with 99% identity. | BLAT^a
    Filter | Retain single hits of substantial length. | Part of Building_exon_probes.sh^b
    Cluster | Remove isoforms and loci sharing >90% identity. | CD-HIT-EST^c, grab_singleton_clusters.py^b
    Filter | Retain loci with long exons summing to desired length. | blat_block_analyzer.py^b
    Cluster | Remove exons sharing >90% identity. | CD-HIT-EST^c, grab_singleton_clusters.py^b
Short read processing and data analysis
    Read processing | Adapter trimming and quality filtering. | Trimmomatic^d
    Exon assembly | Reconstruct a sequence for each sample, for each exon. | YASRA^e, Alignreads^f
    Identify assembled contigs | If contig identity is unknown, identify which target exon(s) it corresponds to. | BLAT^a
    Sequence alignment I: Collate exons | Cluster orthologous exons across samples. | assembled_exons_to_fasta.py^b
    Sequence alignment II: Perform alignment | Align homologous bases within each exon. | MAFFT^g
    Concatenate exons | For each locus, concatenate the aligned exons. | catfasta2phyml.pl^h
    Gene tree construction | For each locus, estimate the maximum likelihood gene tree. | RAxML^i
    Species tree construction | Estimate the species tree from independent gene trees in a coalescent framework. | MP-EST^j
^b New scripts written for this protocol, an example data set, and any future updates are available at https://github.com/listonlab/.
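
The two probe-design filters that follow clustering (keep only sequences that end up alone in their cluster, then keep only loci whose long exons sum to a desired length) can be illustrated with a minimal Python sketch. This is not the code of grab_singleton_clusters.py or blat_block_analyzer.py. The function names, the in-memory inputs, and the thresholds `min_exon_len` and `min_total_len` are hypothetical placeholders chosen for illustration, and the cluster text is assumed to follow the usual CD-HIT `.clstr` layout, where a `>Cluster N` header precedes member lines containing `>name...`.

```python
import re
from typing import Dict, List


def singleton_clusters(clstr_text: str) -> List[str]:
    """Return names of sequences that are the only member of their cluster.

    Assumes CD-HIT-style .clstr text: a ">Cluster N" header line followed by
    one line per member, each containing ">sequence_name...".
    """
    clusters: List[List[str]] = []
    for line in clstr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            clusters.append([])  # start a new, empty cluster
            continue
        match = re.search(r">(.+?)\.\.\.", line)
        if match and clusters:
            clusters[-1].append(match.group(1))
    return [members[0] for members in clusters if len(members) == 1]


def loci_with_long_exons(
    exon_lengths: Dict[str, List[int]],
    min_exon_len: int = 120,   # hypothetical threshold, not from the table
    min_total_len: int = 960,  # hypothetical threshold, not from the table
) -> Dict[str, List[int]]:
    """Keep loci whose exons of at least min_exon_len sum to min_total_len."""
    kept: Dict[str, List[int]] = {}
    for locus, lengths in exon_lengths.items():
        long_exons = [n for n in lengths if n >= min_exon_len]
        if sum(long_exons) >= min_total_len:
            kept[locus] = long_exons
    return kept


# Toy input: locusA clusters with an isoform, locusB and locusC are singletons.
DEMO_CLSTR = (
    ">Cluster 0\n"
    "0\t1500nt, >locusA... *\n"
    "1\t1480nt, >locusA_iso2... at +/97.50%\n"
    ">Cluster 1\n"
    "0\t1200nt, >locusB... *\n"
    ">Cluster 2\n"
    "0\t900nt, >locusC... *\n"
)
# Toy exon lengths (bp) per locus.
DEMO_EXONS = {"locusB": [400, 350, 300, 90], "locusC": [200, 150, 100]}

singles = singleton_clusters(DEMO_CLSTR)
retained = loci_with_long_exons({name: DEMO_EXONS[name] for name in singles})
print(singles)
print(retained)
```

In this toy input, locusA is discarded because it clusters with an isoform, and locusC is discarded because its long exons are too short in total. In practice the exon lengths would come from the BLAT alignment blocks rather than a hand-written dictionary.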