Abstract
How relevant particular genes are to a given biological question is often unclear. State-of-the-art computational methods can help elucidate functional associations between gene expression patterns and subsequent biological experiments. There is therefore a strong need to relate significant differential expression levels to biological variation. RNA sequencing (RNA-seq) uses next-generation sequencing (NGS) technology to scan the RNA molecules in a sample and quantify their abundance. Developing crops that can withstand environmental stresses while maintaining productivity is a basic necessity for agriculture. Arabidopsis thaliana is an ideal model organism for studying biologically relevant questions about global gene regulation in response to stress. The main purpose of this study is to identify differentially expressed genes in A. thaliana under heat-stress conditions. A workflow for RNA-seq analysis is proposed that identifies these genes using two analysis methods: edgeR and the Fisher criterion (FC). The identified candidate genes are validated against two widely used references, DRASTIC and TAIR10. The results suggest that the two methods can be combined to perform differential expression analysis of RNA-seq data without strong assumptions. Comparative evaluation of the proposed methods demonstrates successful identification of stress-related genes, with improved prediction accuracy. This indicates that the presented workflow and differential analysis methods can be applied to identify differentially expressed genes from RNA-seq data for other organisms. Finally, literature-based verification is presented for the top 5% of detected genes shared between the FC and edgeR methods, with justification offered to support the discovery of new heat-response-related genes.