Abstract
Conference Title: 2018 13th International Conference on Computer Engineering and Systems (ICCES)
Conference Start Date: 2018, Dec. 18
Conference End Date: 2018, Dec. 19
Conference Location: Cairo, Egypt

Twitter is the most popular micro-blogging medium; it allows users to exchange short messages and gives public figures a platform to share news. Twitter currently averages 328 million monthly active users and is growing rapidly. Detecting the credibility of information shared on Twitter has become a necessity, especially during high-impact events. In this paper, a classification model based on supervised machine learning techniques is proposed to detect credibility. The proposed model uses an extensive set of features, including both content-based and source-based features. The research compares the performance of five different machine learning classifiers on three feature sets: content-based, source-based, and a combination of both. The best performance is achieved with the combined feature set and Random Forests as the classifier, with an accuracy of 78.4%, precision of 79.6%, recall of 91.6% and F1-measure of 85.2%. Experiments also show that, in terms of F1-measure, the proposed model achieves an improvement of 22% over CRF, which applies the same approach. Feature analysis is presented to highlight the importance of the source-based features compared with the content-based features as deciders of credibility.
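The reported F1-measure can be checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch for illustration only, not the authors' code; the function name `f1_score` and the variable names are assumptions introduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1-measure)."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for Random Forests on the combined feature set.
reported_precision = 0.796
reported_recall = 0.916

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.1%}")  # prints "F1 = 85.2%"
```

The computed value agrees with the 85.2% F1-measure stated in the abstract.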