Abstract
This study describes the development of a text document clustering optimization model based on a novel Genetic Algorithm-Shuffled Frog-Leaping Algorithm (GA-SFLA), which clusters text documents efficiently using a selected subset of features. In the proposed approach, a Genetic Algorithm (GA) performs feature selection, while a Shuffled Frog-Leaping Algorithm (SFLA) performs the clustering. The approach was evaluated on the popular "20Newsgroup" text document dataset. Across multiple experiments, GA-SFLA considerably improved clustering on this dataset compared with classical K-means clustering, and the feature selection stage contributed greatly to that improvement. However, the improvement comes at the expense of longer computation time.
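To make the feature selection stage concrete, the following is a minimal sketch of how a GA can search over binary feature masks, where each bit marks one term as kept or dropped. It is an illustration under stated assumptions, not the method evaluated in this study: the fitness function (variance of the kept term columns minus a per-feature penalty), the operators (truncation selection, one-point crossover, bit-flip mutation), the function names, and the toy term-count matrix are all hypothetical, and the SFLA clustering stage is not shown.

```python
import random


def variance(column):
    """Population variance of one term column."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


def fitness(mask, docs, penalty=0.05):
    """Hypothetical fitness: total variance of the kept term columns,
    minus a small penalty for every kept feature."""
    total = 0.0
    for j, keep in enumerate(mask):
        if keep:
            total += variance([doc[j] for doc in docs]) - penalty
    return total


def ga_select_features(docs, pop_size=20, generations=30,
                       mutation_rate=0.1, seed=0):
    """Evolve binary feature masks (assumes at least two features)."""
    rng = random.Random(seed)
    n_features = len(docs[0])
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged.
        population.sort(key=lambda m: fitness(m, docs), reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return max(population, key=lambda m: fitness(m, docs))


# Toy term-count matrix (rows: documents, columns: terms).
# Terms 0 and 1 vary across documents; terms 2 and 3 are constant.
toy_docs = [
    [3, 0, 1, 1],
    [4, 0, 1, 1],
    [0, 5, 1, 1],
    [0, 4, 1, 1],
]
best_mask = ga_select_features(toy_docs)
print(best_mask)
```

In a two-stage pipeline of the kind described above, the returned mask would be used to drop unselected term columns before the documents are passed to the clustering stage. A more faithful fitness would score each mask by the quality of the clustering it produces, which is also one reason such a pipeline costs more computation time than clustering alone.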