Abstract
Document classification is a prominent area of research that has evolved as a result of the exponential growth in the use of electronic documents. Classifying documents requires an understanding of document units, which involves removing insignificant data and improving computational efficiency. This paper deals with approaches to Dimensionality Reduction (DR) in document units for Telugu. Bag of words is a generic model for English document classification, but adapting this model to Indic scripts has been found to yield meager performance. Two approaches are presented in this paper. The first is a language-specific, corpus-based dimensionality reduction, termed validity-based DR. The second is a category- and document-specific approach, termed category-based DR. The performance of the two approaches is evaluated using accuracy as the measure.