A New Methodology for Automatic Building Arabic Field Association Terms Dictionary Using POS

El-Sayed Atlam

Journal article

INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, Vol.14(8), pp.2823-2833

01/08/2011

Abstract

Engineering

Engineering, Multidisciplinary

Science & Technology

Technology

Research has shown that Field Association (FA) Terms are effective in document classification, similar file retrieval and passage retrieval, and hold considerable potential for applications in natural language processing and information retrieval. Many researchers have proposed effective methods to automatically extract relevant FA Terms and build a comprehensive dictionary. However, all previous studies are based on FA Terms in English and Japanese, and extending FA Terms to other languages such as Arabic could strengthen further research. This paper presents a new method to extract FA Terms from domain-specific corpora in the Arabic language using part-of-speech (POS) tagging, pattern rules and corpora comparison. An experimental evaluation was carried out for 14 different fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhayah news, selecting an average of 2,825 FA Terms (single and compound) per field. In the experimental results, recall and precision are 84% and 79%, respectively. Moreover, the quality of the FA Terms dictionary is tested by its ability to identify the fields of 8,054 documents collected from two different sources, the Wikipedia and Alhayah corpora.
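
The abstract names three ingredients (POS tags, pattern rules and corpora comparison) without giving the actual rules, so the following is only a minimal sketch of how such a pipeline might be wired together, not the paper's method. The tag patterns in `PATTERNS`, the `min_share` threshold, the function names and the toy tokens (English stand-ins for tagged Arabic words) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical POS patterns for candidate terms (not taken from the paper).
PATTERNS = [("NOUN",), ("NOUN", "NOUN"), ("NOUN", "ADJ")]


def candidates(tagged):
    """Yield word sequences whose POS tags match one of PATTERNS."""
    for pattern in PATTERNS:
        n = len(pattern)
        for i in range(len(tagged) - n + 1):
            window = tagged[i:i + n]
            if tuple(tag for _, tag in window) == pattern:
                yield " ".join(word for word, _ in window)


def field_terms(corpora, min_share=0.8):
    """Keep candidates whose occurrences concentrate in a single field."""
    counts = {f: Counter(candidates(tagged)) for f, tagged in corpora.items()}
    total = Counter()
    for c in counts.values():
        total.update(c)
    result = {f: [] for f in corpora}
    for term, n in total.items():
        # Field in which the candidate occurs most often, and that count.
        field, best = max(((f, c[term]) for f, c in counts.items()),
                          key=lambda pair: pair[1])
        # Assumed rule: the term must occur mostly in that one field.
        if best / n >= min_share:
            result[field].append(term)
    return {f: sorted(ts) for f, ts in result.items()}


# Toy tagged corpora; tokens are illustrative stand-ins only.
corpora = {
    "sports": [("match", "NOUN"), ("team", "NOUN"), ("won", "VERB"), ("year", "NOUN")],
    "economy": [("market", "NOUN"), ("price", "NOUN"), ("rose", "VERB"), ("year", "NOUN")],
}
terms = field_terms(corpora)
```

In this toy run, a candidate such as "year" appears equally in both fields and is dropped, while single and compound candidates seen in only one field are kept for that field. A real system would need an Arabic POS tagger and rules tuned to the corpora.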

Details

Title: A New Methodology for Automatic Building Arabic Field Association Terms Dictionary Using POS
Creators - without role: El-Sayed Atlam
Publication Details: INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, Vol.14(8), pp.2823-2833
Publisher: Int Information Inst
Number of pages: 11
Identifiers: 9930343208331
Academic Unit: Taibah University
Language: English
Resource Type: Journal article