Abstract
Various socio-political organizations, from activist groups to propaganda campaigners, create accounts on Twitter to reach out to, influence, and gain followers. To analyze the impact of these organizational accounts, the first step is to identify them. In this paper, we develop and experiment with a set of network-based, behavioral, temporal, and spatial characteristics of these accounts, independent of domain or language, to identify features that can be useful in detecting organizational accounts. To assess this approach, we experimented with a microblog corpus comprising over 7 million tweets from 150,000 Twitter users in Bangladesh, tweeted between June and October 2016. We sampled 31,139 accounts using cold-start heuristics to locate and label nearly 200 organizational accounts, distributed as 68 NGOs, 62 news outlets, 35 political groups, and 17 public intellectuals and iconic figures. The remaining accounts were labeled as individuals. Next, we developed a set of features and experimented with a set of linear and non-linear classifiers. The highest-performing sparse logistic regression classifier achieved 68.2% precision and 64.4% recall, for an F1 score of 66.2%, in detecting the rare organizational accounts (less than 1% of accounts) using a set of content- and language-independent features.