Codebook-based feature encodings are a standard framework for image recognition. A codebook is usually constructed by a clustering algorithm, such as the k-means or the Gaussian Mixture Model (GMM). The codebook size is an important factor in the trade-off between recognition performance and computational complexity, and the traditional framework has a disadvantage for image recognition with a large codebook: the number of unique clusters becomes smaller than the designated codebook size because some clusters converge to close positions. This paper focuses on this disadvantage from the perspective of the distribution of prior probabilities and presents a clustering framework with two objectives that serve as alternatives to the k-means and the GMM. Our approach is first evaluated on synthetic clustering datasets to analyze how it differs from traditional clustering. Although our alternative to the k-means generates results similar to those of the k-means, it is able to finely tune clusters toward our objective. Our alternative to the GMM significantly improves our objective and constructs intuitively appropriate clusters, especially for huge and complicatedly distributed samples. In the experiments on image recognition, two state-of-the-art encodings, the Fisher Vector (FV) using the GMM and the Vector of Locally Aggregated Descriptors (VLAD) using the k-means, are evaluated on two publicly available image datasets, the Birds and the Butterflies. The recognition performance of the VLAD with our approach tends to be worse than that of the original VLAD. On the other hand, the FV with our approach improves the performance, especially at larger codebook sizes.

Clustering is a fundamental technique for several purposes, such as statistical analysis and data mining. The main purpose of clustering is to make groups called clusters. Each clustering technique has a specific objective for making these groups, such as finding groups that minimize a quantization error or estimating an appropriate distribution. This paper focuses on clustering in image recognition algorithms and presents an efficient objective. In recent image recognition problems, the local feature framework is a key technique: it detects regions of interest in an image and describes a discriminative feature vector for each region. The basic idea of codebook-based encodings is to capture the statistics of the distribution of local features extracted from an image. By treating local features as visual vocabularies appearing in an image, images can be processed in the same way as in natural language processing (NLP). In NLP, the bag-of-words (BOW) model expresses a document feature vector by assigning the words in sentences to corresponding common words and counting their frequencies. For images, common visual words, called a codebook, are constructed by clustering local features extracted from various images. The model in image recognition follows the same procedure as the BOW to represent image feature vectors. This approach is well known as the bag-of-visual-words (BoVW) model, and its variants have achieved excellent performance on several tasks, such as object recognition and image retrieval.
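As a concrete illustration of the BoVW pipeline described above, the sketch below builds a tiny codebook with plain Lloyd's k-means and encodes an "image" as a normalized histogram of visual-word frequencies. The 2-D toy descriptors, the codebook size of two, and the deterministic farthest-first initialization are illustrative assumptions, not details from the paper.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Deterministic farthest-first initialization (an illustrative choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    return centroids

def lloyd_kmeans(points, k, iters=20):
    """Plain Lloyd's k-means: centroids that (locally) minimize quantization error."""
    centroids = init_centroids(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: dist2(p, centroids[c]))
            groups[j].append(p)
        # Update step: move each centroid to the mean of its group.
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if its group went empty
                centroids[j] = tuple(sum(xs) / len(g) for xs in zip(*g))
    return centroids

def bovw_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest visual word and count frequencies."""
    hist = [0] * len(codebook)
    for d in descriptors:
        j = min(range(len(codebook)), key=lambda c: dist2(d, codebook[c]))
        hist[j] += 1
    total = sum(hist)
    return [h / total for h in hist]  # L1-normalized word-frequency vector

# Toy 2-D "local descriptors" from a training set: two well-separated groups.
train = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
codebook = lloyd_kmeans(train, k=2)

# Encode one "image": three descriptors near the first word, one near the second.
image = [(0.1, 0.1), (0.0, 0.0), (0.2, 0.2), (5.0, 5.0)]
print(bovw_histogram(image, codebook))  # [0.75, 0.25]
```

In a real system the descriptors would be high-dimensional local features (e.g. SIFT) and the codebook far larger, which is exactly where the trade-off between codebook size and computational cost arises.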
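The VLAD encoding mentioned above can be sketched in the same spirit: instead of counting assignments, it accumulates the residuals between each descriptor and its nearest visual word, concatenates the per-word sums, and L2-normalizes the result. The two-word codebook and toy descriptors here are hypothetical values chosen for illustration.

```python
def vlad_encode(descriptors, codebook):
    """VLAD: sum the residuals (descriptor minus nearest visual word) per word,
    concatenate the per-word sums, and L2-normalize the concatenation."""
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest visual word.
        j = min(range(len(codebook)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(d, codebook[c])))
        for t in range(dim):
            acc[j][t] += d[t] - codebook[j][t]
    vec = [v for word_sum in acc for v in word_sum]
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm > 0 else vec

# Hypothetical two-word codebook in 2-D and four local descriptors of one image.
codebook = [(0.0, 0.0), (5.0, 5.0)]
image = [(0.1, 0.1), (-0.1, 0.1), (5.0, 5.2), (5.2, 5.0)]
vlad = vlad_encode(image, codebook)
print([round(v, 4) for v in vlad])  # unit-norm vector of residual directions
```

Because the residuals retain direction and magnitude rather than just counts, the VLAD vector has dimensionality (codebook size × descriptor dimension), which is why it is competitive with much larger BoVW histograms.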