* Check RNA-seq read quality with FastQC.
  Version: 0.10.1
* Map reads to genome with Tophat.
  * Build genome index for Tophat using ToxoDB GT1 genome version 9.0.
    Version: bowtie2-intel/2.1.0
    Command: bowtie2-build ToxoDB-9.0_TgondiiGT1_Genome.fasta
  * Map reads with Tophat.
    Versions: tophat-gcc/2.0.8 bowtie2-intel/2.1.0 samtools-intel/0.1.19 boost-intel/1.53.0
    Command: tophat -p 8 --output-dir <1.fastq> <2.fastq>
* Check quality of mapping with samtools flagstat.
  Version: samtools-intel/0.1.19
  Command: samtools flagstat
* Check quality of mapping with RNA-SeQC.
  Version: java/1.6.0_25
  Command: java -Xmx42g -jar /path/to/RNA-SeQC_v1.1.7.jar -r -s -o -t
* Create gene models with Cufflinks.
  Versions: cufflinks-gcc/2.1.1 tophat-gcc/2.0.8 bowtie2-intel/2.1.0 samtools-intel/0.1.19
  Command: cufflinks -p 8 -o --overlap-radius 1 --min-isoform-fraction 0.04 --pre-mrna-fraction 0.1
* Merge gene models with Cuffmerge.
  Version: cufflinks-gcc/2.1.1
  Command: cuffmerge -p 8 -o -s
* Truncate transcripts to CDS boundaries with GeneGuillotine.
  * Manually inspect the reference genome for overlapping genes and remove the overlapping fragments. If you run GeneGuillotine first, it will report where these are.
  * Run GeneGuillotine.
    Versions: GeneGuillotine/1.0 ruby-gcc/1.9.2
    Command: geneguillotine.rb -i -g -o
* Analyse differential expression with limma/voom.
  Versions: limma/3.18.1 edgeR/3.4.0 Rsubread/1.12.1 R/3.0
  Commands:
    t4h_fcounts <- featureCounts(files=t4h_targets$BamFile,file.type="BAM",annot.ext=gff_file,isGTFAnnotationFile=TRUE,nthreads=8,isPairedEnd=TRUE,PEReadsReordering=TRUE)
    t4h_isexpr <- rowSums(cpm(t4h_fcounts$counts) > 3) >= 3
    t4h_x <- t4h_fcounts$counts[t4h_isexpr,]
    t4h_y <- voom(t4h_x,t4h_design,plot=TRUE)
    plotMDS(t4h_y,xlim=c(-2.5,2.5))
    t4h_fit <- eBayes(lmFit(t4h_y,t4h_design))
    topTable(t4h_fit,coef=2)
    write.csv(topTable(t4h_fit,coef=2,number=Inf,p.value=0.05),file="/tmp/limma_out_4h")
* Analyse differences in alternative splicing with DEXSeq.
  Versions: DEXSeq/1.8.0 python-gcc/2.7.5 R/3.0
  Commands:
  * Modify dexseq_prepare_annotation.py to work in unstranded mode, by changing
      exons = HTSeq.GenomicArrayOfSets( "auto", stranded=True )
    to
      exons = HTSeq.GenomicArrayOfSets( "auto", stranded=False )
  * Create flattened GFF.
      python dexseq_prepare_annotation_unstranded.py
  * For each bam file,
      samtools view -o 8_2_v2.sam 8_2_v2.bam
      sort -k1,1 -k2,2n 8_2_v2.sam > 8_2_v2_sorted.sam
      python dexseq_count.py --p yes -r name -s no
  * Import into R, then
      ecs0_vs_4 = read.HTSeqCounts(countfiles = file.path(inDir, paste(rownames(samples))), design = samples, flattenedfile = annotationfile)
      sampleNames(ecs0_vs_4) = rownames(samples)
      ecs0_vs_4 <- estimateSizeFactors(ecs0_vs_4)
      sizeFactors(ecs0_vs_4)
      ecs0_vs_4 <- estimateDispersions(ecs0_vs_4, nCores=8)
      ecs0_vs_4 <- fitDispersionFunction(ecs0_vs_4)
      ecs0_vs_4 <- testForDEU( ecs0_vs_4, nCores=8)
      ecs0_vs_4 <- estimatelog2FoldChanges( ecs0_vs_4 )
      res_0_vs_4 <- DEUresultTable(ecs0_vs_4)
      table ( res_0_vs_4$padjust < 0.05 )
      table ( tapply( res_0_vs_4$padjust < 0.05, geneIDs(ecs0_vs_4), any ) )
      plotMA( ecs0_vs_4, FDR=0.5, ylim=c(-4,4), cex=0.8 )
      DEXSeqHTML( ecs0_vs_4, FDR=0.05, color=c("#FF000080", "#0000FF80") )
* Analyse alternative splicing with JunctionJuror.
  Versions: JunctionJuror/1.0 ruby-gcc/1.9.2
  Commands: junctionjuror.rb -j -g -o -t 3
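
The DEXSeq step asks for a hand edit of dexseq_prepare_annotation.py to switch it to unstranded mode. If you would rather script that change so it can be repeated, something like the sketch below may help. It is only an illustration: the function names `make_unstranded` and `patch_file` are made up here, and it assumes the `stranded=True` call quoted in the protocol appears exactly once in your copy of the script, which may not hold for other DEXSeq versions.

```python
import re

# Matches the call quoted in the protocol, tolerating whitespace differences.
STRANDED_CALL = re.compile(
    r'HTSeq\.GenomicArrayOfSets\(\s*"auto"\s*,\s*stranded\s*=\s*True\s*\)'
)
# The replacement text, exactly as given in the protocol.
UNSTRANDED_CALL = 'HTSeq.GenomicArrayOfSets( "auto", stranded=False )'


def make_unstranded(source):
    """Return the script text with the exons array set to unstranded mode.

    Raises ValueError unless the expected call is found exactly once
    (an assumption of this sketch), so an unexpected script layout is
    never patched silently.
    """
    patched, count = STRANDED_CALL.subn(UNSTRANDED_CALL, source)
    if count != 1:
        raise ValueError(
            "expected exactly one stranded=True call, found %d" % count
        )
    return patched


def patch_file(src_path, dst_path):
    """Read src_path, apply make_unstranded, and write the result to dst_path."""
    with open(src_path) as handle:
        original = handle.read()
    with open(dst_path, "w") as handle:
        handle.write(make_unstranded(original))
```

Calling `patch_file("dexseq_prepare_annotation.py", "dexseq_prepare_annotation_unstranded.py")` from the directory holding the DEXSeq scripts would produce the file used in the "Create flattened GFF" step. The original script is left untouched, so the stranded version remains available.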