Abstract
YouTube (owned by Google Inc.) is arguably among the most popular social media platforms, used by millions across the globe. It provides an ever-growing, unique, and rich source of content that presents new opportunities and challenges for information discovery and analysis. It is therefore pertinent to explore and understand a topic through YouTube content in order to discover interesting information about public opinions and sentiments. This paper presents an integrated framework that facilitates the acquisition, storage, management, processing, and visualization of relevant content in support of such analysis. It not only collects a significant portion of the content relevant to a given topic in a short time but also offers tools for visual exploratory analysis of (i) temporal evolution, (ii) the vocabulary network, (iii) authors' relative popularity and influence, (iv) categories, and (v) user communities and influencers. The utility and effectiveness of the framework are demonstrated through content analysis of a famous YouTube entertainment topic, “Gangnam Style”.
• An integrated content acquisition, management, processing, and visualization framework for YouTube.
• A query-evolution-based recursive algorithm that improves YouTube data acquisition performance for a given topic.
• Data processing algorithms for the discovery of social connections and the vocabulary network.
• Visualization of social, spatio-temporal, linguistic, popularity, and growth aspects of content.
• Demonstration of features using real data from a famous YouTube topic, “Gangnam Style”.