Abstract
Biomedicine, health care, and the life sciences have recently come to play a significant role in data- and information-intensive science. In bioinformatics and computational biology in particular, data volumes are growing rapidly, and the data are diverse and highly complex: they may be noisy, multidimensional, structured, or unstructured. Analysing such data therefore requires a dedicated modelling and integrative analysis system. This study develops a conceptual deep learning framework for predictive modelling of diseases in bioinformatics using genome sequence data. First, the data are pre-processed: missing values are checked and the features are scaled using the Min-Max approach. Second, significant features are selected with a random forest and then extracted with a deep learning-based autoencoder. Third, the data are classified with an XGBoost classifier. Finally, the performance of the proposed model is tested on the TCGA-PANCAN dataset and compared with the traditional approach in terms of precision, recall, F-measure (F-score), accuracy, success rate, and error rate.
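As a rough illustration of the first and last stages described above, the following is a minimal sketch, not the authors' implementation. It assumes missing values are encoded as NaN and filled with the column mean before Min-Max scaling, and it computes the reported metrics for a binary labelling only. The function names `min_max_scale` and `binary_metrics` are hypothetical, and the intermediate stages (random forest selection, autoencoder extraction, XGBoost classification) are omitted because they depend on third-party libraries.

```python
from math import isnan


def min_max_scale(rows):
    """Scale every column to [0, 1].

    Assumption: missing values are NaN and are replaced by the column mean
    of the observed values; every column has at least one observed value.
    """
    scaled = [list(row) for row in rows]
    for j in range(len(rows[0])):
        observed = [row[j] for row in rows if not isnan(row[j])]
        mean = sum(observed) / len(observed)
        lo, hi = min(observed), max(observed)
        span = hi - lo
        for row in scaled:
            value = mean if isnan(row[j]) else row[j]
            # A constant column carries no information; map it to 0.0.
            row[j] = 0.0 if span == 0 else (value - lo) / span
    return scaled


def binary_metrics(y_true, y_pred):
    """Precision, recall, F-measure, accuracy and error rate for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
    }
```

Because TCGA-PANCAN is a multi-class dataset, a real evaluation would need per-class or averaged versions of these metrics. The binary form is used here only to keep the definitions visible.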