Categories: Machine Learning

SIFT/SURF BoW for a large number of clusters

If you spend some time browsing, you will find several Python examples of SIFT/SURF bag-of-words (BoW) classifiers on the internet. They use clustering (usually K-Means, from the sklearn or cv2 clustering library) to build a dictionary of visual words from SIFT/SURF features. However, most of the sample code that I found can’t properly handle a large number (> 100) of visual words/clusters, while some papers (such as this one) show that the best results are achieved using 2000+ clusters.

Building the visual dictionary with cv2.BOWKMeansTrainer is super slow when using > 100 clusters. Using sklearn.cluster.KMeans solves the speed issue, but it requires a huge amount of memory (8 GB of RAM is still insufficient to handle > 400 clusters). That’s where sklearn.cluster.MiniBatchKMeans comes into the picture: it updates the cluster centers from small random batches of samples rather than from the whole data set at each iteration.

import cv2
import numpy as np
import progressbar
from sklearn.cluster import MiniBatchKMeans

...

def build_dictionary(xfeatures2d, dir_names, file_paths, dictionary_size):
  # dir_names is not used here; it is kept so existing callers keep working
  print('Computing descriptors..')
  desc_list = []
  num_files = len(file_paths)
  bar = progressbar.ProgressBar(maxval=num_files).start()
  for i in range(num_files):
    p = file_paths[i]
    image = cv2.imread(p)
    if image is not None:  # cv2.imread returns None for unreadable files
      gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
      kp, dsc = xfeatures2d.detectAndCompute(gray, None)
      if dsc is not None:  # dsc is None when no keypoints are found
        desc_list.extend(dsc)
    bar.update(i + 1)
  bar.finish()

  print('Creating BoW dictionary using K-Means clustering with k={}..'.format(dictionary_size))
  # each K-Means update step uses a random batch of 100 descriptors
  dictionary = MiniBatchKMeans(n_clusters=dictionary_size, batch_size=100, verbose=1)
  dictionary.fit(np.array(desc_list))
  return dictionary

...

# usage example
# (from OpenCV 4.4 onwards SIFT is also available as cv2.SIFT_create())
sift = cv2.xfeatures2d.SIFT_create()
dir_names = ['class1', 'class2', ...]
file_paths = ['/data/class1/1.jpg', '/data/class1/2.jpg', ..., '/data/class2/1.jpg', '/data/class2/2.jpg', ...]
dictionary_size = 2800
dictionary = build_dictionary(sift, dir_names, file_paths, dictionary_size)

Using the code above, I was able to complete the whole process (preprocessing 2000+ raw images, building the SIFT dictionary, and cross-validating 6 classifiers) within 52 minutes (still quite long, but acceptable?). With SURF, it takes around 45 minutes. The complete main files for both experiments can be found here and here.
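
The dictionary is only the first half of a BoW pipeline: before any classifier can be trained, each image has to be encoded as a histogram of visual words. The snippet below is a minimal sketch of that step, not the code from the linked main files. The function name `bow_histogram` and the L1 normalization are my assumptions; the only thing it relies on is that the dictionary exposes a scikit-learn style `predict` method, which the fitted MiniBatchKMeans does.

```python
import numpy as np


def bow_histogram(descriptors, dictionary, dictionary_size):
    """Encode one image's descriptors as a normalized visual-word histogram.

    bow_histogram is a hypothetical helper (not part of cv2 or sklearn).
    `dictionary` is assumed to have a scikit-learn style predict() method.
    """
    # An image without keypoints has no descriptors: return an all-zero vector
    if descriptors is None or len(descriptors) == 0:
        return np.zeros(dictionary_size, dtype=np.float32)

    # Assign every descriptor to its nearest cluster center (visual word)
    words = dictionary.predict(np.asarray(descriptors))

    # Count how many times each visual word occurs in this image
    hist = np.bincount(words, minlength=dictionary_size).astype(np.float32)

    # L1-normalize (an assumption) so images with different numbers of
    # keypoints produce comparable feature vectors
    return hist / hist.sum()
```

With the names from the usage example, `bow_histogram(dsc, dictionary, dictionary_size)` would be called once per image, where `dsc` comes from `sift.detectAndCompute(gray, None)`, and the resulting vectors stacked into the feature matrix for the classifiers.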

Published by
yohanes.gultom@gmail.com
Tags: python
