2017 Dec 5;4:170178. doi: 10.1038/sdata.2017.178

Table 1. Open access tools and commands used to perform data analyses (analytical steps correspond to those in Fig. 2).

Analytical Step	Description	Software/Version	Command
Read QC	Read quality control	FastQC v0.11.5	fastqc /path_to/raw.fq.gz (Data Citation 1 and Data Citation 2)
Clean Reads	Adaptor and low-quality trimming	Trimmomatic v0.36	java -jar /path_to/trimmomatic-0.36.jar PE -phred33 -threads 8 raw_R1.fq.gz raw_R2.fq.gz clean_FP.fq.gz clean_FU.fq.gz clean_RP.fq.gz clean_RU.fq.gz HEADCROP:10 ILLUMINACLIP:/path_to/adapters_list.fa:2:30:10 TRAILING:10 SLIDINGWINDOW:4:10 MINLEN:25
De novo assembly	Transcriptome assembly	Trinity v2.2.0	Trinity --seqType fq --left clean_FP.fq.gz --right clean_RP.fq.gz --CPU 20 --max_memory 150G --SS_lib_type RF --output trinity_assembly
Assembly curation	Filtering out contigs with low read support	Transrate v1.0.3	transrate --assembly Ltimidus_Trinity.fasta --left clean_FP.fq.gz --right clean_RP.fq.gz --threads 10 --reference Oryctolagus_cuniculus.OryCun2.0.81.pep.all.fa --output transrate_Ltimidus_Trinity
Remove redundancy	Clustering of highly homologous sequences	CD-HIT-EST v4.6.4	cd-hit-est -i good.Ltimidus_Trinity.fasta -c 0.95 -o AlpsIrel.fasta
ORF prediction	Filtering based on candidate coding regions and pfam annotation	TransDecoder v3.0.0	TransDecoder.LongOrfs -t AlpsIrel.fasta
		HMMER v3.1b2	hmmscan --cpu 8 --domtblout pfam.domtblout /path_to/Pfam-A.hmm transdecoder_dir/longest_orfs.pep
		TransDecoder v3.0.0	TransDecoder.Predict -t AlpsIrel.fasta --cpu 2 --retain_pfam_hits pfam.domtblout
Annotation	Annotation assessment	Trinotate v3.0.1	wget "https://data.broadinstitute.org/Trinity/Trinotate_v3_RESOURCES/Trinotate_v3.sqlite.gz" -O Trinotate.sqlite.gz
	Annotation assessment	Gunzip	gunzip Trinotate.sqlite.gz
	Conditional reciprocal best blast annotation	crb-blast v0.6.6	crb-blast --query AlpsIrel.cds --target database(SP and Ocun) --threads 4 --split 4 --output blastx.outfmt6
	Conditional reciprocal best blast annotation	crb-blast v0.6.6	crb-blast --query AlpsIrel.pep --target database(SP and Ocun) --threads 4 --split 4 --output blastp.outfmt6
	SignalP annotation	SignalP v4.1	signalp -f short -n signalp.out AlpsIrel.pep
	Pfam annotation	HMMER v3.1b2	hmmscan --cpu 2 --domtblout TrinotatePFAM.out Pfam-A.hmm AlpsIrel.pep
	TMHMM annotation	TMHMM v2.0	tmhmm --short < AlpsIrel.pep > tmhmm.out
	Combine annotations	Trinity utilities v2.2.0	/path_to/trinityrnaseq-2.2.0/util/support_scripts/get_Trinity_gene_to_trans_map.pl AlpsIrel.fasta >AlpsIrel.gene_trans_map
	Combine annotations	Trinotate v3.0.1	Trinotate Trinotate.sqlite init --gene_trans_map AlpsIrel.gene_trans_map --transcript_fasta AlpsIrel.fasta --transdecoder_pep AlpsIrel.pep
	SwissProt annotation load	Trinotate v3.0.1	1. Trinotate Trinotate.sqlite LOAD_swissprot_blastp SP.blastp.outfmt6; 2. Trinotate Trinotate.sqlite LOAD_swissprot_blastx SP.blastx.outfmt6
	O.cuniculus annotation load	Trinotate v3.0.1	1. Trinotate Trinotate.sqlite LOAD_custom_blast --outfmt6 Ocun.blastp.outfmt6 --prog blastp --dbtype Ocun; 2. Trinotate Trinotate.sqlite LOAD_custom_blast --outfmt6 Ocun.blastx.outfmt6 --prog blastx --dbtype Ocun
	Pfam annotation load	Trinotate v3.0.1	Trinotate Trinotate.sqlite LOAD_pfam TrinotatePFAM.out
	TMHMM annotation load	Trinotate v3.0.1	Trinotate Trinotate.sqlite LOAD_tmhmm tmhmm.out
	SignalP annotation load	Trinotate v3.0.1	Trinotate Trinotate.sqlite LOAD_signalp signalp.out
	Joint annotation file	Trinotate v3.0.1	Trinotate Trinotate.sqlite report > LtimidusTranscriptome.xls
Mapping	Read mapping onto the curated reference	bwa-mem v0.7.15	bwa index AlpsIrel.cds
Mapping	Read mapping onto the curated reference	bwa-mem v0.7.15	bwa mem -t 10 -R '@RG\tID:pop_sample_lane\tSM:popsample\tLB:LIBsample' AlpsIrel.cds Sample_L_FP.fq.gz Sample_L_RP.fq.gz > sample_lane.sam
BAM conversion, sort and fixmate	Fixmate and BAM conversion	SAMtools v1.3.1	samtools fixmate --output-fmt BAM sample_lane.sam sample_lane_fixmate.bam
BAM conversion, sort and fixmate	BAM sort	SAMtools v1.3.1	samtools sort -O bam -o sample_lane_sorted.bam -T /path_to/temp/ sample_lane_fixmate.bam
Remove duplicates	Mark and remove duplicates	Picard v1.140	java -jar /path_to/picard.jar MarkDuplicates REMOVE_DUPLICATES=True MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=950 ASSUME_SORTED=true VALIDATION_STRINGENCY=SILENT I=sample_lane_sorted.bam I=sample_lane_sorted.bam I=sample_lane_sorted.bam O=sample_rmdup.bam M=duplic_stats_sample TMP_DIR=/path_to/temp
Realignment and recalibration	Realignment target creation	GATK v3.6-0	java -jar /path_to/GenomeAnalysisTK.jar -T RealignerTargetCreator -R AlpsIrel.cds -I sample_rmdup.bam -o sample_int.list
Realignment and recalibration	Indel realignment	GATK v3.6-0	java -jar /path_to/GenomeAnalysisTK.jar -T IndelRealigner -R AlpsIrel.cds -I sample_rmdup.bam -targetIntervals sample_int.list -o sample_realign.bam
SNP call	SNP call	Reads2snp v2.0.64	reads2snp_2.0.64.bin -bamlist LtimLeur_list.txt -bamref AlpsIrel.cds -out LtimVsLeur -min 10 -nbth 12 -th1 0.95 -par 1 -th2 0.01 -opt bfgs -fis 0.0 -pre 0.001 -rqt 20
Differentiation analysis	Remove indels and missing data	VCFtools v0.1.14	vcftools --vcf LtimVsLeur.vcf --recode --recode-INFO-all --remove-indels --max-missing-count 0 --out LtimVsLeur_noindels
	Extract 1 SNP per contig	VCFtools v0.1.14	vcftools --vcf LtimVsLeur_noindels.recode.vcf --recode --recode-INFO-all --thin 10000 --min-alleles 2 --out LtimVsLeur_1SNPperContig
	VCF to STRUCTURE conversion	PGDSpider v2.1.1.0	java -Xmx1024m -Xms512m -jar /path_to/PGDSpider2-cli.jar -inputfile LtimVsLeur_1SNPperContig.recode.vcf -inputformat VCF -outputfile LtimVsLeur_SNPs -outputformat STRUCTURE -spid VCF_to_STRUCTURE.spid
	Structure analysis	STRUCTURE v2.3.4	structure -m mainparams (standard parameters except 1 million steps after a burn-in period of 200 000, K=2 and admixture model)
	Structure analysis	CLUMPAK v42089	Web version used: http://clumpak.tau.ac.il/
	PCA analysis	PLINK v1.90b3.45	plink --file LtimVsLeur_1SNPperContig --pca 3
	PCA analysis	ggplot2 R package v2.2.1	1. R; 2. library(ggfortify); 3. pca <- read.table('plink.eigenvec', header=TRUE); 4. df <- pca[c(3, 4)]; 5. autoplot(prcomp(df), data=pca, colour='Species.Pop', size=5)
GO enrichment	Gene Ontology enrichment analysis	g:Profiler	Available at http://biit.cs.ut.ee/gprofiler/ ; Best per parent group Hierarchical filtering; Input background manually; g:SCS significance threshold.
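
The per-lane mapping rows of Table 1 (bwa mem, samtools fixmate, samtools sort) are applied to every sample and lane, so they lend themselves to a small command generator. The sketch below is not part of the original pipeline: the function names (read_group, map_lane_cmds), the environment variables (REF, THREADS, TMP_DIR) and the file-naming scheme pop_sample_lane_FP.fq.gz / pop_sample_lane_RP.fq.gz are assumptions made for illustration. It only prints the commands, using the options listed in the table, so they can be inspected before being run.

```shell
# Sketch: print the per-lane mapping commands from Table 1 for one lane.
# Nothing is executed here; the output is plain text.

REF="${REF:-AlpsIrel.cds}"            # curated reference named in Table 1
THREADS="${THREADS:-10}"              # thread count used in Table 1
TMP_DIR="${TMP_DIR:-/path_to/temp/}"  # placeholder path, as in Table 1

# Build a read-group string following the Table 1 pattern
# (ID:pop_sample_lane, SM:popsample, LB:LIBsample).
# Arguments: population, sample, lane. The "\t" stays literal for bwa.
read_group() {
  printf '@RG\\tID:%s_%s_%s\\tSM:%s%s\\tLB:LIB%s' "$1" "$2" "$3" "$1" "$2" "$2"
}

# Print the three commands for one lane.
# Input and output file names are hypothetical (prefix = pop_sample_lane).
map_lane_cmds() {
  prefix="$1_$2_$3"
  rg="$(read_group "$1" "$2" "$3")"
  printf "bwa mem -t %s -R '%s' %s %s_FP.fq.gz %s_RP.fq.gz > %s.sam\n" \
    "$THREADS" "$rg" "$REF" "$prefix" "$prefix" "$prefix"
  printf 'samtools fixmate --output-fmt BAM %s.sam %s_fixmate.bam\n' \
    "$prefix" "$prefix"
  printf 'samtools sort -O bam -o %s_sorted.bam -T %s %s_fixmate.bam\n' \
    "$prefix" "$TMP_DIR" "$prefix"
}

# Example: print the commands for two hypothetical lanes of one sample.
for lane in L1 L2; do
  map_lane_cmds Alps 01 "$lane"
done
```

Once the printed commands look right, they can be executed by piping the output to a shell, for example map_lane_cmds Alps 01 L1 | sh. Generating text rather than running the tools directly keeps the sketch usable on a machine where bwa and SAMtools are not installed.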