Abstract
Term weighting is a useful technique for keyword extraction and document classification. The traditional approach relies on high-frequency terms and is called the positive weight (PW) function. This paper presents a new weighting method that relies on low-frequency terms, called the negative weight (NW) function. As an example application, the paper focuses on word similarity for typical verbs and objects.
A negative weighted inverse verb frequency (NWIVF) function is defined in this study, and a new similarity measure is presented that combines the NWIVF and PWIVF (positive weighted inverse verb frequency) functions. The proposed method is applied to 11,000 relationships between verbs and nouns extracted from a large tagged corpus. Compared with the positive weight method, the new method improves recall and precision by 33% and 18%, respectively.