Abstract
The ever-increasing number of text articles on the internet necessitates automated keyphrase tagging, since automatically generated keyphrases aid information retrieval. Producing keyphrases for texts from arbitrary domains requires an automated approach that extracts the key ideas directly from the text. We present a graph-based unsupervised framework for keyphrase extraction and text summarization that operates on a single document, and we propose an improved PageRank algorithm for directed graphs. The input document is first converted into a graph in which the nodes represent sentences and the edges represent the similarity scores between them. The similarity score between two sentences combines their cosine similarity and their semantic similarity. The threshold is set to the average of all similarity scores, and an edge is included in the graph only if its weight is greater than or equal to this threshold. The graph is then converted into a directed graph, and the proposed PageRank algorithm is applied to it to rank the sentences. We evaluate our model on the Inspec (Hulth, 2003) and NUS (Nguyen and Kan, 2007) data sets for keyphrase extraction, and on the DUC 2002 data set and BBC News Articles for text summarization.
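The pipeline described above can be illustrated with a minimal Python sketch. It is not the proposed algorithm: standard weighted PageRank stands in for the improved variant, a Jaccard word overlap (`jaccard_sim`) stands in for the semantic similarity measure, the two scores are combined by a plain average, and edges are assumed to point from the earlier sentence to the later one. All function names and parameters are hypothetical.

```python
import math
from collections import Counter


def cosine_sim(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_sim(a, b):
    """Assumed stand-in for semantic similarity: word-set overlap."""
    union = a.keys() | b.keys()
    return len(a.keys() & b.keys()) / len(union) if union else 0.0


def build_graph(sentences, semantic_sim=jaccard_sim):
    """Return (edges, n): thresholded directed edges and the node count."""
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    scores = {}
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: combine the two scores by a simple average.
            scores[(i, j)] = 0.5 * (cosine_sim(bags[i], bags[j])
                                    + semantic_sim(bags[i], bags[j]))
    # Threshold is the average of all pairwise similarity scores.
    threshold = sum(scores.values()) / len(scores) if scores else 0.0
    # Keep edges at or above the threshold; zero-weight edges are dropped.
    # Assumed direction: earlier sentence i -> later sentence j.
    edges = {e: w for e, w in scores.items() if w >= threshold and w > 0}
    return edges, n


def pagerank(edges, n, damping=0.85, iters=100):
    """Standard weighted PageRank by power iteration (not the improved variant)."""
    if n == 0:
        return []
    out_weight = [0.0] * n
    for (i, _), w in edges.items():
        out_weight[i] += w
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[i] for i in range(n) if out_weight[i] == 0.0)
        new_rank = [(1 - damping) / n + damping * dangling / n] * n
        for (i, j), w in edges.items():
            new_rank[j] += damping * rank[i] * w / out_weight[i]
        rank = new_rank
    return rank
```

Given a list of sentences, `edges, n = build_graph(sentences)` followed by `pagerank(edges, n)` yields one score per sentence; sorting sentence indices by score and keeping the top few would give an extractive summary under these assumptions.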