Abstract
Supervised machine learning methods for automatic subjectivity and sentiment analysis (SSA) are problematic when applied to social media such as Twitter, since they do not generalise well to unseen topics. A possible remedy for this problem is to apply distant supervision (DS) approaches, which learn from large amounts of automatically annotated data. In this research, we explore DS for SSA on Arabic Twitter feeds using emoticons as noisy labels. We achieve 95.19% accuracy, which is a 48.57% absolute improvement over our previous fully supervised results. While our results show a significant gain in detecting subjectivity, the approach proves difficult to apply to sentiment analysis. An error analysis suggests that the most likely cause of this shortcoming is the ambiguous facing of emoticons caused by the right-to-left direction of the Arabic alphabet.