Abstract
We propose an image annotation approach that combines fuzzy clustering with feature discrimination, a greedy selection and joining (GSJ) algorithm, and Bayes rule. Clustering groups image regions into prototypical region clusters that summarize the training data and serve as the basis for annotating new test images. Because this problem involves clustering sparse, high-dimensional data, we use a semi-supervised constrained clustering algorithm that performs clustering and feature discrimination simultaneously. The constraints consist of pairs of image regions that should not be assigned to the same cluster. They are deduced from the irrelevance of all concepts annotating the training images, and they help guide the clustering process. The GSJ algorithm uses the fuzzy membership function of each region cluster. Finally, Bayes rule is used to label images based on the posterior probability of each concept. The proposed approach was implemented and tested on a data set of 3000 images using four-fold cross-validation.
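
As an illustration of the final labeling step only, the following minimal sketch applies Bayes rule to rank concepts by posterior probability. The abstract does not specify how the likelihoods are obtained from the region clusters, so the function names (`concept_posteriors`, `top_concepts`), the concept names, and the numeric values are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List


def concept_posteriors(
    likelihoods: Dict[str, float],
    priors: Dict[str, float],
) -> Dict[str, float]:
    """Bayes rule: P(c | x) = P(x | c) P(c) / sum_k P(x | k) P(k).

    `likelihoods[c]` stands for P(x | c) and `priors[c]` for P(c).
    How these are estimated is an assumption left outside this sketch.
    """
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("evidence is zero; no concept explains the observation")
    return {c: j / evidence for c, j in joint.items()}


def top_concepts(posteriors: Dict[str, float], k: int) -> List[str]:
    """Return the k concepts with the highest posterior probability."""
    return sorted(posteriors, key=posteriors.get, reverse=True)[:k]


# Hypothetical values for one test image.
priors = {"sky": 0.5, "grass": 0.3, "tiger": 0.2}
likelihoods = {"sky": 0.2, "grass": 0.5, "tiger": 0.1}

posteriors = concept_posteriors(likelihoods, priors)
labels = top_concepts(posteriors, k=2)
```

In this sketch an image would be annotated with the `k` highest-ranked concepts; the actual number of labels and any thresholding are not stated in the abstract.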