Abstract
The bag-of-visual-words (BOW) model has recently become a popular representation of image content. Each image is represented by a frequency histogram of visual words, obtained by assigning each keypoint of the image to its closest visual word. The codebook is usually constructed via K-means clustering. In this paper, we instead use an unsupervised neural network algorithm, the standard Self-Organizing Map (SOM), to overcome some weaknesses of K-means. We evaluated our method on two public datasets. On the Holidays dataset, our results exceed the current state-of-the-art retrieval performance with the baseline BOW; on the Kentucky dataset, however, performance is lower. We experimentally show that the proposed soft-weighting approach significantly improves over the baseline BOW with a small codebook size.
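
To make the baseline representation concrete, the following is a minimal sketch of hard-assignment BOW histogram construction, assuming a Euclidean distance between descriptors and visual words. The function names `nearest_word` and `bow_histogram`, the 2-D toy descriptors, and the three-word codebook are illustrative assumptions and are not taken from the paper; the codebook here is given by hand, whereas in practice it would come from K-means or a trained SOM. The proposed soft-weighting scheme is not shown.

```python
from collections import Counter
from math import dist


def nearest_word(descriptor, codebook):
    """Return the index of the visual word closest to the descriptor.

    Assumption: closeness is measured with the Euclidean distance.
    """
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def bow_histogram(descriptors, codebook):
    """Build a normalized frequency histogram of visual words for one image.

    Each descriptor (one per keypoint) votes for its single closest word.
    """
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = len(descriptors)
    # An image with no keypoints yields an all-zero histogram.
    return [counts[k] / total if total else 0.0 for k in range(len(codebook))]


# Toy data (hypothetical): three visual words and four keypoint descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (3.9, 4.2)]

histogram = bow_histogram(descriptors, codebook)
print(histogram)
```

In this toy case, one descriptor falls nearest to the first word, two to the second, and one to the third, so the second bin receives half of the mass. Two images can then be compared by a distance or similarity between their histograms.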