Abstract
Linked Open Data (LOD) is a graph-based repository of data that uses a data representation format called the Resource Description Framework (RDF). The basic unit of RDF data is a subject-property-object triple. Seen as a network of interconnected pieces of data, LOD provides an environment well suited to developing learning methods that rely on data integration. Applying frequentist approaches to data integration identifies pieces of information that are consistent and frequently used. An essential element of such methods is the ability to identify similar pieces of data. In practice, however, different sources of information use different vocabularies to represent the relations (properties) that exist between data items, which poses a challenge for data integration methods.
In this paper, we propose a simple approach for determining degrees of equivalence between relations (properties) defined by different LOD vocabularies. We process the numbers of occurrences of matching pairs of RDF triples to determine intervals representing the lower and upper levels of property equivalence. As a result, we obtain a graph of equivalent properties in which the interval-based strengths of edges represent the degrees of similarity between properties. A case study illustrating the details of the approach and a validation experiment are included.
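To make the idea of counting matching triple pairs concrete, the following minimal Python sketch derives an interval for each pair of properties that connect the same subject-object pairs. It is an illustration under stated assumptions, not the formulas used in the paper: the choice of bounds (shared pairs divided by the larger and by the smaller occurrence count), the function name `property_equivalence_intervals`, and the sample triples with their prefixed names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def property_equivalence_intervals(triples):
    """Return {(p, q): (lower, upper)} for properties sharing (subject, object) pairs.

    Assumption (not taken from the paper): two triples "match" when they have
    the same subject and object but different properties; the bounds are the
    number of shared pairs divided by the larger / smaller occurrence count.
    """
    # Collect the distinct (subject, object) pairs connected by each property.
    pairs_by_property = defaultdict(set)
    for subject, prop, obj in triples:
        pairs_by_property[prop].add((subject, obj))

    intervals = {}
    for p, q in combinations(sorted(pairs_by_property), 2):
        shared = len(pairs_by_property[p] & pairs_by_property[q])
        if shared == 0:
            continue  # no matching triples, so no edge in the graph
        n_p = len(pairs_by_property[p])
        n_q = len(pairs_by_property[q])
        lower = shared / max(n_p, n_q)  # pessimistic estimate
        upper = shared / min(n_p, n_q)  # optimistic estimate
        intervals[(p, q)] = (lower, upper)
    return intervals


# Hypothetical sample data; the prefixed names are illustrative only.
sample_triples = [
    ("ex:Alice", "foaf:knows", "ex:Bob"),
    ("ex:Alice", "rel:acquaintanceOf", "ex:Bob"),
    ("ex:Bob", "foaf:knows", "ex:Carol"),
    ("ex:Bob", "rel:acquaintanceOf", "ex:Carol"),
    ("ex:Carol", "foaf:knows", "ex:Dave"),
    ("ex:Doc1", "dc:creator", "ex:Alice"),
]

intervals = property_equivalence_intervals(sample_triples)
for (p, q), (lower, upper) in intervals.items():
    print(f"{p} ~ {q}: [{lower:.2f}, {upper:.2f}]")
```

Each key of the returned dictionary can be read as an edge of the property graph and the tuple as its interval-valued strength. In this sample only one edge appears, because the third property shares no subject-object pair with the other two.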