Set-based unified approach for summarization of a multi-attributed graph

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

doi:10.1007/s11280-016-0388-y

Back

Set-based unified approach for summarization of a multi-attributed graph

Journal article

Peer reviewed

Set-based unified approach for summarization of a multi-attributed graph

Kifayat Ullah Khan, Waqas Nawaz and Young-Koo Lee

World wide web (Bussum), Vol.20(3), pp.543-570

01/05/2017

DOI: https://doi.org/10.1007/s11280-016-0388-y

Abstract

Computer Science

Computer Science, Information Systems

Computer Science, Software Engineering

Science & Technology

Technology

Rich availability of real world knowledge in a graph based on attributes of each vertex and its interactions, is a valuable source of information. However, it is hard to derive this useful knowledge since either graphs of current era do not fit in main memory or cannot be efficiently processed. In this regard, it is better to create a meaningful summary graph that is compact yet preserves intrinsic properties of its underlying graph. In this paper, we propose a summarization approach for a big graph, where each node is attached with multiple attributes. Main intuition behind our approach is based on a real life concept that tells "friends of friends have many common friends and also have similar likes and preferences". We use this phenomenon as the basis in our paper to identify sets of nodes having common neighborhood and similar attributes, for summarization. Existing aggregation-based summarization methods use pairwise heuristic to find similar pairs of nodes for compression. Whereas, pairwise similarity computations can check both neighborhood as well as attributes similarities, however, it is impractical to summarize a big graph. For this purpose, we propose a set-based approach for efficient summarization. To identify each set, we adopt Locality Sensitive Hashing (LSH) to restrict similarity computations within candidate similar nodes only. Since, existing LSH techniques only consider neighborhood similarity in a graph, therefore we propose a Unified LSH approach to simultaneously consider both attributes and neighborhood similarities. Further, using Minimum Description Length (MDL) principle, we present a new technique to perform lossless summarization of each set by creating a super node or adding a new virtual node in summary graph. We evaluate our proposed approach with state of the art methods on synthetic and publicly available real world graphs and observe better results in terms of execution time, compression ratio, and number of corrections to restructure the original graph.

Metrics

1 Record Views

Details

Title: Set-based unified approach for summarization of a multi-attributed graph
Creators - without role: Kifayat Ullah Khan - Kyung Hee University
Waqas Nawaz - Kyung Hee University
Young-Koo Lee - Kyung Hee University
Publication Details: World wide web (Bussum), Vol.20(3), pp.543-570
Publisher: Springer Nature
Number of pages: 28
Grant note: 2015R1A2A2A01008209 / National Research Foundation of Korea(NRF) - Korea government(MEST); Ministry of Education, Science & Technology (MEST), Republic of Korea; National Research Foundation of Korea; Korean Government
Identifiers: 9916803908331
Academic Unit: Islamic University of Al Madinah
Language: English
Resource Type: Journal article