SIMILARITY-DISSIMILARITY PLOT FOR HIGH DIMENSIONAL DATA OF DIFFERENT ATTRIBUTE TYPES IN BIOMEDICAL DATASETS

Muhammad Arif; Saleh Basalamah

Journal article

Peer reviewed

INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, Vol.8(2), pp.1275-1297

01/02/2012

Computer Science

Computer Science, Artificial Intelligence

Science & Technology

Technology

Abstract

In real-life biomedical classification applications, the feature space may be of such high dimension that the class distribution cannot be visualized directly. Moreover, features may have numeric, ordinal, categorical or binary attributes, and most of the time a feature set is composed of mixed attribute types. In this paper, the concept of similarity-dissimilarity is extended to various types of attributes. The similarity-dissimilarity plot projects the high-dimensional feature space onto a two-dimensional plot, revealing the class separation in the feature space, which may be continuous or discrete. Furthermore, the effect of various distance measures proposed in the literature for different types of attributes is also studied. An index called the percentage of data points above the similarity-dissimilarity line (PAS) is proposed, which is the fraction of data points found nearer to their own class than to other classes. Several real-life biomedical datasets are used to show the effectiveness of the proposed similarity-dissimilarity plot and the PAS index.
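The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions the abstract does not confirm. "Similarity" is taken as the distance from a point to its nearest neighbour in its own class, and "dissimilarity" as the distance to its nearest neighbour in any other class. The similarity-dissimilarity line is assumed to be the diagonal where the two are equal. The mixed-attribute distance is a Gower-style average chosen for illustration. The names `mixed_distance`, `similarity_dissimilarity` and `pas_index` are hypothetical.

```python
import numpy as np

def mixed_distance(a, b, is_categorical, ranges):
    """Gower-style distance between two records (an assumed choice).

    Numeric attributes contribute a range-normalised absolute difference;
    categorical or binary attributes contribute 0 on a match and 1 otherwise.
    """
    total = 0.0
    for x, y, cat, r in zip(a, b, is_categorical, ranges):
        if cat:
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y) / r if r > 0 else 0.0
    return total / len(a)

def similarity_dissimilarity(X, y, is_categorical):
    """Return per-point nearest own-class and nearest other-class distances."""
    X = np.asarray(X, dtype=float)   # categorical values given as numeric codes
    y = np.asarray(y)
    ranges = X.max(axis=0) - X.min(axis=0)
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = mixed_distance(X[i], X[j], is_categorical, ranges)
    np.fill_diagonal(D, np.inf)      # a point is not its own neighbour
    same = y[:, None] == y[None, :]
    sim = np.where(same, D, np.inf).min(axis=1)    # nearest point of own class
    dis = np.where(~same, D, np.inf).min(axis=1)   # nearest point of another class
    return sim, dis

def pas_index(sim, dis):
    """Percentage of points lying above the assumed diagonal line dis == sim."""
    return 100.0 * np.mean(dis > sim)

# Toy data: one numeric attribute and one binary attribute, two classes.
X = [[0.0, 0], [0.1, 0], [1.0, 1], [0.9, 1]]
y = [0, 0, 1, 1]
sim, dis = similarity_dissimilarity(X, y, is_categorical=[False, True])
pas = pas_index(sim, dis)
print(sim, dis, pas)
```

Plotting `sim` on one axis against `dis` on the other would give a two-dimensional view of the kind the abstract describes. Under these assumptions, points for which `dis` exceeds `sim` lie closer to their own class than to any other.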

Details

Title: SIMILARITY-DISSIMILARITY PLOT FOR HIGH DIMENSIONAL DATA OF DIFFERENT ATTRIBUTE TYPES IN BIOMEDICAL DATASETS
Creators: Muhammad Arif - Netaji Subhas National Institute of Sports; Saleh Basalamah - Netaji Subhas National Institute of Sports
Publication Details: INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, Vol.8(2), pp.1275-1297
Publisher: ICIC INT
Number of pages: 23
Identifiers: 9931327608331
Academic Unit: Umm Al Qura University
Language: English
Resource Type: Journal article