Abstract
In this paper we combine the neighborhood and attributes similarity to summarize big graphs where each node is attached with multiple attributes. The main intution behind our approach is that sets of nodes having common links, in graphs, usually have same attributes. Thus compressing such Sets of Similar Nodes (SSNs) can significantly reduce the size of big graphs, yet preserving the overall properties of the underlying graphs. However, efficiently finding such sets is computationally expensive. Finding these using pair wise similarity computations is not scalable while exploiting the nodes ordering in graphs, does not consider their attributes. For this purpose we propose a Unified Locality Sensitive Hashing (ULSH) approach to approximate the SSNs, since in graphs LSH can assemble the nodes based on neighborhood similarity only. Further using Minimum Description Length (MDL) principle, we propose a Unified Graph Summarization (UGS) technique to perform lossless compression of each set by creating a super node or adding a new virtual node in the graph. We compare our approach with two state of the art methods by experiments on synthetic and publically available real world graphs and observe very encouraging results.