Abstract
The large amount of Arabic text produced daily creates a need to analyze these texts. A comprehensive literature review reveals several issues related to Arabic text summarization, keyword extraction, and sentiment analysis. These issues arise from several factors: the structure and morphology of Arabic text, a lack of machine-readable Arabic dictionaries, insufficient tools for processing Arabic text, the absence of standard datasets, the inherently cursive script, and isolated characters. There is therefore a need to represent Arabic text in forms that machine learning algorithms, deep learning algorithms, and existing analysis tools can read easily. To achieve this, the Arabic texts must be converted into English texts. This paper proposes a lexicon, called the AEC-Lexicon, for use by all researchers working in Arabic; it is based on the Arabic case system and converts Arabic text into English text. Experimental results based on latent semantic indexing (LSI) show that texts generated by the proposed lexicon improve significantly on existing Arabic-to-English converted texts in terms of readability, comprehensibility, and relevance to the original Arabic text.