Abstract
In this paper, we consider the problem of evaluating the quality of hierarchical models. This task arises because researchers currently rely on subjective evaluation, such as surveys, to test the goodness of the hierarchies discovered by their models. We propose three methods to evaluate the quality of a hierarchy extracted from unstructured text. These methods reflect three important characteristics of an optimal tree: (1) Coverage, which means that a topic at a high level, close to the root node, should cover a wider range of sub-concepts than topics at a lower level; (2) Parent-child relatedness, which means that a parent topic in the tree should be more semantically related to its children than to its non-children; (3) Topic coherence, which means that all words within a topic should be semantically related to one another. Moreover, we introduce a new metric, called Interest-based coherent, to evaluate hierarchical trees extracted from structured data such as relational data. We compare different state-of-the-art methods and perform extensive experiments on three real datasets. The results confirm that the proposed methods can properly evaluate the quality of the hierarchies discovered by several models.