Improving Arabic Text Categorization using Decision Trees

Fouzi Harrag; Eyas El-Qawasmeh; Pit Pichappan; IEEE

doi:10.1109/NDT.2009.5272214


Conference proceeding


NDT: 2009 FIRST INTERNATIONAL CONFERENCE ON NETWORKED DIGITAL TECHNOLOGIES, pp.110-115

01/01/2009

DOI: https://doi.org/10.1109/NDT.2009.5272214

Abstract

Computer Science

Computer Science, Information Systems

Science & Technology

Technology

This paper presents the results of classifying Arabic text documents using a decision tree algorithm. Experiments are performed over two self-collected corpora, and the results show that the suggested hybrid approach, Document Frequency Thresholding combined with the information gain criterion embedded in the decision tree algorithm, is the preferable feature selection criterion. The study concludes that the improved classifier is highly effective, with a generalization accuracy of about 0.93 for the scientific corpus and 0.91 for the literary corpus. The effectiveness of the decision tree classifier also increases as the training set grows, and the nature of the corpus influences classifier performance.
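The two selection steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `min_df` parameter, and the toy English tokens standing in for Arabic terms are all assumptions made for illustration. It first drops terms whose document frequency falls below a threshold, then scores the surviving terms by information gain, the criterion a decision tree typically uses when choosing a split.

```python
import math
from collections import Counter

def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df

def df_threshold(docs, min_df):
    """Keep only terms that occur in at least min_df documents (min_df is an assumed parameter)."""
    df = document_frequency(docs)
    return {term for term, count in df.items() if count >= min_df}

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting the documents on presence of term."""
    present = [y for tokens, y in zip(docs, labels) if term in tokens]
    absent = [y for tokens, y in zip(docs, labels) if term not in tokens]
    total = len(labels)
    remainder = ((len(present) / total) * entropy(present)
                 + (len(absent) / total) * entropy(absent))
    return entropy(labels) - remainder

def best_split_term(docs, labels, min_df=2):
    """Among DF-filtered terms, return the one with the highest information gain.

    Ties are broken alphabetically because candidates are sorted first.
    """
    candidates = df_threshold(docs, min_df)
    return max(sorted(candidates),
               key=lambda t: information_gain(docs, labels, t))

# Hypothetical toy corpus: English placeholder tokens, two classes.
docs = [
    ["physics", "energy", "theory"],
    ["physics", "experiment", "energy"],
    ["poem", "verse", "theory"],
    ["poem", "novel", "verse"],
]
labels = ["scientific", "scientific", "literary", "literary"]

kept = df_threshold(docs, min_df=2)
root_term = best_split_term(docs, labels, min_df=2)
```

In this toy data, rare terms such as "experiment" are removed by the threshold before any gain is computed, while a term that appears equally in both classes, such as "theory", survives the threshold but receives zero gain and would not be chosen as a split.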


Details

Title: Improving Arabic Text Categorization using Decision Trees
Creators - without role: Fouzi Harrag - University Ferhat Abbas of Setif
Eyas El-Qawasmeh - JUST Univ, Comp Sci Dept, Amman 25000, Jordan
Pit Pichappan - Imam Mohammad ibn Saud Islamic University
IEEE
Publication Details: NDT: 2009 FIRST INTERNATIONAL CONFERENCE ON NETWORKED DIGITAL TECHNOLOGIES, pp.110-115
Publisher: IEEE
Number of pages: 2
Identifiers: 9952997708331
Academic Unit: King Saud University
Language: English
Resource Type: Conference proceeding