Diapause development RNAseq, combined assembly

Summary

Type	Analysis
Name	Diapause development RNAseq, combined assembly
Description	Samples from four time points (November, January, March and May) span the diapause maintenance (November), termination (January), and post-diapause quiescent (March and May) stages of development. Raw 100 bp Illumina reads from each sample were quality trimmed with DynamicTrim (SolexaQA_v.2.2) using a Phred quality score cutoff of 19, and only reads at least 50 bp long were retained using LengthSort (SolexaQA_v.2.2). Transcript assembly and differential gene expression analyses were carried out using the Tuxedo Suite. Based on an Agilent 2100 Bioanalyzer report on the sequenced libraries, we calculated the --mate-inner-dist and --mate-std-dev parameters for TopHat2 (v2.0.10) alignment of reads to the Mrot_1.0 genome assembly. The inner distance between the mate pairs was calculated to be approximately 0 bp by subtracting the adapter length (~125 bp) and the combined length of the paired-end reads (200 bp) from the average size of the Illumina libraries (324 bp). Because the size distribution of the library had a right skew, we used a conservative standard deviation of 24 bp, one-third of the calculated standard deviation of the library. In addition, because the libraries were stranded, we used the --library-type fr-firststrand parameter in TopHat2. Data from both Illumina lanes for each sample were pooled. Next, Cufflinks (v2.1.1) was used to generate transcripts from the TopHat2 output for each sample. All transcripts were then combined using Cuffmerge (v1.0.0) to generate a single reference transcript assembly. Transcript descriptions were assigned based on best-hit alignments to Apis mellifera RefSeq proteins and NCBI’s nr database using NCBI’s BLASTX. Gene ontology terms were assigned using Blast2GO Pro, and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways were assigned using the DAVID Knowledgebase based on the protein GI identifiers obtained from the BLASTX protein alignments.
Software Name	Tuxedo Suite (TopHat2 v2.0.10/Cufflinks v2.1.1/Cuffmerge v1.0.0)
Software Version	NA
Data Source Name	Diapause development RNAseq
Organism	Megachile rotundata
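
The mate-pair arithmetic in the Description can be sketched in Python as follows. This is a minimal illustration, not the pipeline that was actually run: the function names, the index prefix (Mrot_1.0_index), the read file names, and the output directory are all hypothetical placeholders. The library standard deviation of 72 bp is not stated in the Description; it is inferred from the reported value of 24 bp being one-third of it. The three TopHat2 options shown are the ones named in the Description, plus the standard -o output option.

```python
import shlex


def mate_inner_distance(library_size_bp, adapter_bp, read_length_bp, n_mates=2):
    """Library fragment size minus adapters minus the combined mate lengths."""
    return library_size_bp - adapter_bp - n_mates * read_length_bp


def tophat2_args(index, reads1, reads2, inner_dist, std_dev, out_dir):
    """Assemble a TopHat2 argument list using the options named in the record."""
    return [
        "tophat2",
        "--mate-inner-dist", str(inner_dist),
        "--mate-std-dev", str(std_dev),
        "--library-type", "fr-firststrand",  # stranded libraries
        "-o", out_dir,
        index, reads1, reads2,
    ]


# Values taken from the Description: 324 bp library, ~125 bp adapters, 100 bp reads.
raw_inner = mate_inner_distance(324, 125, 100)  # 324 - 125 - 200 = -1

# The adapter length is approximate, so the record reports the distance as 0 bp.
reported_inner = 0

# Assumption: 72 bp library standard deviation, inferred from 3 x 24 bp.
library_sd_bp = 72
conservative_sd = library_sd_bp // 3

# Hypothetical index prefix, read files, and output directory.
args = tophat2_args(
    "Mrot_1.0_index",
    "sample_R1.fastq",
    "sample_R2.fastq",
    reported_inner,
    conservative_sd,
    "tophat_out",
)

if __name__ == "__main__":
    print(shlex.join(args))
```

Building the command as a list rather than a single string keeps each option paired with its value and avoids shell quoting problems if the list is later passed to subprocess.run.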