Abstract
Anomaly detection in data sets is one of the most important challenges for modern analysts and data administrators. It usually relies on algorithms that operate on raw data. In this study, we analyze whether the well-known Isolation Forest algorithm, which is based on binary search trees, can be improved by preprocessing the data: attributes are grouped first, and records are then grouped within each attribute group. Attribute clustering is based on hierarchical grouping, while record grouping uses K-Means and Fuzzy C-Means. To describe the relationships between records, membership functions built from the distances of records to cluster centroids are also used. This approach offers a new perspective on the Isolation Forest method and leads to a significant improvement in the results for selected public databases.
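The preprocessing pipeline summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the attribute distance (1 - |Pearson correlation|), the average-linkage criterion, the Fuzzy C-Means style membership formula with fuzzifier m = 2, the choice to append memberships to the raw attributes, all parameter values, the synthetic data, and the helper names `group_attributes`, `memberships`, and `preprocess` are hypothetical. Only the K-Means variant of record grouping is shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest


def group_attributes(X, n_groups):
    """Hierarchically cluster columns; distance is 1 - |Pearson correlation| (assumed)."""
    dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(dist, 0.0)
    dist = (dist + dist.T) / 2.0  # enforce exact symmetry before condensing
    tree = linkage(squareform(dist, checks=False), method="average")
    # "maxclust" yields at most n_groups attribute groups
    return fcluster(tree, t=n_groups, criterion="maxclust")


def memberships(distances, m=2.0, eps=1e-12):
    """Membership degrees from record-to-centroid distances (FCM-style formula, assumed)."""
    inv = 1.0 / np.power(distances + eps, 2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1


def preprocess(X, n_attr_groups=2, n_record_clusters=3, seed=0):
    """Group attributes, cluster records within each group, return stacked memberships."""
    labels = group_attributes(X, n_attr_groups)
    blocks = []
    for g in np.unique(labels):
        sub = X[:, labels == g]
        km = KMeans(n_clusters=n_record_clusters, n_init=10, random_state=seed).fit(sub)
        # KMeans.transform returns the distance of every record to every centroid
        blocks.append(memberships(km.transform(sub)))
    return np.hstack(blocks)


# Synthetic data (assumption): 200 records, 4 attributes, 5 shifted records.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)  # make two attributes correlated
X[:5] += 6.0  # synthetic anomalies

# Assumed design: append the membership degrees to the raw attributes.
features = np.hstack([X, preprocess(X)])
forest = IsolationForest(n_estimators=100, random_state=0).fit(features)
scores = forest.score_samples(features)  # lower score = more anomalous
```

In this sketch the Isolation Forest sees both the raw attributes and the membership degrees; feeding it the memberships alone, or replacing K-Means with a Fuzzy C-Means implementation, would be equally plausible readings of the description.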