Abstract
•The rejection option in pattern recognition problems is studied.
•Native (proper) patterns are recognized and foreign (erroneous) patterns are rejected.
•Clustering in unsupervised mode is used to discover structures in the data.
•A novel unsupervised method for rejecting foreign patterns is proposed.
•Empirical evaluations are carried out on a suite of publicly available medical datasets.
The study addresses the recognition of native (proper) patterns and the rejection of foreign (erroneous) patterns. We present a novel unsupervised approach to rejecting foreign patterns. We construct a geometrical model that identifies regions of the feature space predominantly occupied by native patterns and determines regions where foreign patterns are localized. The model is constructed in an unsupervised mode: we use clustering to discover structures in the data and use the revealed geometry to form regions with a high likelihood of being occupied by native patterns and regions in which foreign patterns are likely to be localized. The geometry of the rejection region is adjusted by two parameters, which are tuned to balance the rejection of foreign patterns against the acceptance of native patterns. We show that the proposed method applies not only to multiclass data processing problems but can also be beneficial when the only available information concerns a single phenomenon (so-called one-class data). We demonstrate the usefulness of the proposed approach by studying several publicly available medical datasets.
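
The general idea of clustering-based rejection can be illustrated with a minimal sketch. It is not the geometrical model of this study, whose details the abstract does not give. The sketch assumes a plain k-means clustering and hyperspherical native regions, and the names `fit_native_regions`, `is_native`, `quantile`, and `scale` are hypothetical. The last two merely stand in for the two tuning parameters mentioned above.

```python
import numpy as np


def fit_native_regions(X, n_clusters=2, quantile=0.95, n_iter=50, seed=0):
    """Cluster native patterns (plain k-means) and return hypersphere
    centers and radii. Assumption: spheres stand in for the native regions."""
    rng = np.random.default_rng(seed)
    # Initialize centers with randomly chosen native patterns.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        # Assign every pattern to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members):  # keep the old center if a cluster empties
                centers[k] = members.mean(axis=0)
    # Final assignment and distance of each pattern to its own center.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    own = dists[np.arange(len(X)), labels]
    # Radius of a cluster: a quantile of its member-to-center distances.
    radii = np.array([
        np.quantile(own[labels == k], quantile) if np.any(labels == k) else 0.0
        for k in range(n_clusters)
    ])
    return centers, radii


def is_native(X, centers, radii, scale=1.0):
    """Accept a pattern if it falls inside at least one scaled sphere;
    everything outside all spheres is rejected as foreign."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return (dists <= scale * radii[None, :]).any(axis=1)


# Synthetic illustration only (not data from the study).
rng = np.random.default_rng(1)
native = np.vstack([
    rng.normal(0.0, 0.3, size=(100, 2)),
    rng.normal(5.0, 0.3, size=(100, 2)),
])
foreign = np.array([[20.0, 20.0], [-15.0, 10.0]])

centers, radii = fit_native_regions(native)
native_ok = is_native(native, centers, radii, scale=1.2)
foreign_ok = is_native(foreign, centers, radii, scale=1.2)
```

In this sketch, lowering `quantile` or `scale` shrinks the accepted regions, which rejects more foreign patterns at the cost of rejecting more native ones. Raising them has the opposite effect.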