Homework 1 - due Sept 24

In this homework you will implement a bag-of-features naive Bayes model to classify images.

First download this dataset. It consists of images from 5 categories of the Caltech 101 dataset and is the same dataset used in HW0. To unpack the archive on Linux, use the command: tar zxvf HW0_data.tar.gz

In Matlab:

  1. The first 15 images from each category will be used as your training set.
  2. Detect SIFT features in the training images using this code. See the README file for how to compile the code. The included function sift_demo.m shows how to compute SIFT features on images.
  3. Cluster the detected SIFT features in your training images into 500 visual words using k-means (you can implement k-means clustering yourself, find code on the web, or use this code, which includes a README detailing its parameters). Please state which code you used for clustering.
  4. Assign each feature in the training and testing images to its closest visual word. For each image, construct a bag of keypoints: a histogram that counts the number of features assigned to each visual word.
  5. Train a naive Bayes model for classification. Here your feature vector is the bag of keypoints. You will need to compute the parameters of this model, P(C) and P(Fi|C). You may assume that P(C) is uniform, but you should compute P(Fi|C) from the training data. See the lecture slides and "Visual Categorization with Bags of Keypoints," especially equations 1 and 2, for more details. Use Laplace smoothing (eqn 2) to avoid probabilities of 0.
  6. Classify each test image as the category with maximum probability.
  7. Report your classification accuracy for each category and the average accuracy across all categories. First compute the accuracy for each category individually, then take the mean of the 5 per-category accuracies.
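
As a rough illustration of steps 4 to 6, the sketch below runs on tiny made-up data (3 visual words, 2 categories, 2-dimensional "descriptors") rather than real SIFT features. All variable names (centers, desc, trainHist, trainLabels, testHist) and all values are hypothetical, and the smoothing shown is the usual add-one form for a multinomial model: (1 + count of word i in class C) / (number of words + total count in class C). Check it against equation 2 in the paper before relying on it.

```matlab
% --- Step 4 (sketch): assign descriptors to the closest visual word ---
% Hypothetical codebook (K x D) and descriptors from one image (N x D).
% With real SIFT features D would be 128 and K would be 500.
centers = [0 0; 10 10; 0 10];
desc    = [1 1; 9 9; 0 1; 1 9];
K = size(centers, 1);

% Squared Euclidean distance from every descriptor to every center (N x K).
d2 = bsxfun(@plus, sum(desc.^2, 2), sum(centers.^2, 2)') - 2 * (desc * centers');
[~, word] = min(d2, [], 2);          % index of the closest visual word
h = accumarray(word, 1, [K 1])';     % 1 x K histogram (bag of keypoints)

% --- Step 5 (sketch): estimate P(Fi|C) with Laplace smoothing ---
% Hypothetical training histograms (one row per image) and labels.
trainHist   = [4 1 0; 3 2 0; 0 1 5; 1 0 4];
trainLabels = [1; 1; 2; 2];
numClasses  = 2;
numWords    = size(trainHist, 2);

wordCounts = zeros(numClasses, numWords);
for c = 1:numClasses
    % Total count of each visual word over all training images of class c.
    wordCounts(c, :) = sum(trainHist(trainLabels == c, :), 1);
end
% Add 1 to every count and numWords to every class total, so no
% probability is exactly 0 and each row still sums to 1.
pWordGivenClass = bsxfun(@rdivide, wordCounts + 1, sum(wordCounts, 2) + numWords);
logPrior = log(ones(numClasses, 1) / numClasses);   % uniform P(C)

% --- Step 6 (sketch): classify test histograms in log space ---
testHist = [5 1 0; 0 0 6];           % hypothetical test histograms
% logPost(n, c) = log P(C=c) + sum_i testHist(n, i) * log P(Fi|C=c)
logPost = bsxfun(@plus, testHist * log(pWordGivenClass)', logPrior');
[~, predicted] = max(logPost, [], 2);
```

Working with sums of logarithms instead of products of probabilities is a design choice made here to avoid numerical underflow when an image contains many features; it does not change which category attains the maximum.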

What to Turn in

Email to cse591@gmail.com: your code and a web page describing the results of your experiments. The web page should include the classification accuracy for each category, the average classification accuracy across all categories, and a confusion matrix showing the percentage of times one category was confused with another (i.e. for each category x and category y, the percentage of images with actual label x that the algorithm labeled as category y). Visualize your confusion matrix using the imagesc command, then save the figure as a JPEG using the command print('-djpeg','confusion.jpg') for display on your web page.
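
One possible way to build the confusion matrix and the accuracies described above is sketched here on made-up labels for 2 categories. The vectors actual and predicted and their values are hypothetical stand-ins for your own 5-category test results.

```matlab
% Hypothetical true and predicted labels for 7 test images.
actual     = [1; 1; 1; 2; 2; 2; 2];
predicted  = [1; 1; 2; 2; 2; 1; 2];
numClasses = 2;

% counts(x, y) = number of images with actual label x labeled as y.
counts = accumarray([actual predicted], 1, [numClasses numClasses]);

% Convert each row to percentages of that category's test images.
% (A category with no test images would give a row of NaN.)
confusion = 100 * bsxfun(@rdivide, counts, sum(counts, 2));

perClassAcc = diag(confusion);   % per-category accuracy, in percent
meanAcc     = mean(perClassAcc); % mean of the per-category accuracies
```

The resulting confusion matrix can then be passed to imagesc and saved with the print command given above.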