Object Classification on the aYahoo Image Dataset

1. You need to have Caffe installed on your system to run the tasks below. If not, follow the tutorial here to get Caffe up and running on your system.
2. You need to download the aYahoo image dataset here.
3. You need to get all the pre-trained models: alexnet.caffemodel, caffenet.caffemodel and googlenet.caffemodel to get started on extracting features from images.

We will start by getting the paths of all the downloaded images into a text file. Run this line in a terminal, with the correct path to the images folder and the path to the text file to which all the image paths should be appended.

find `pwd`/examples/images/ayahoo -type f -exec echo {} \; > examples/tempFeaExtr/temp.txt
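If you'd rather build the same list from Python, here is a minimal sketch mirroring the `find` command above (the folder path is whatever you pass in):

```python
import os

def list_images(root):
    """Collect the absolute path of every file under root,
    like `find root -type f` does."""
    paths = []
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            paths.append(os.path.abspath(os.path.join(dirpath, name)))
    return paths
```

Writing them out is then just a matter of joining `paths` with newlines into the text file.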

Now we need to label the images. I am assigning labels starting from 0. It is recommended to sort all the names in the text file, so we can divide the whole dataset into train and test sets with a reasonable number of images of each class in both, rather than splitting the whole data randomly.

outfile= open('zz_temp.txt', 'w')
infile= open('z_temp.txt', 'r')

# map each class name to its numeric label, starting from 0
labels= {"bag": "0", "building": "1", "carriage": "2", "centaur": "3",
         "donkey": "4", "goat": "5", "jetski": "6", "monkey": "7",
         "mug": "8", "statue": "9", "wolf": "10", "zebra": "11"}

for line in infile:
    # file name sits after the last '/'; the class name is the part before '_'
    a= line[(len(line)-(line[::-1].index('/'))):-1]
    a= a[:a.index('_')]
    outfile.write(line[:-1]+ " "+ labels[a]+ "\n")

infile.close()
outfile.close()


# sort the labeled lines so images of each class sit together
infile= open('zz_temp.txt', 'r')
outfile= open('z_temp.txt', 'w')
mylines= []

for line in infile:
    mylines.append(line)

mylines.sort()
for a in mylines:
    outfile.write(a)

infile.close()
outfile.close()

As we have labeled and sorted the whole dataset, we now divide it into two sets, namely train_data.txt and test_data.txt. Try to keep the ratio of the number of train images to test images constant across all the classes. Now we have train_data and test_data, and can proceed to the next step.
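The per-class split can be sketched in Python; the 80/20 fraction and the "path label" line format below are my assumptions:

```python
import random
from collections import defaultdict

def split_by_class(labeled_lines, train_frac=0.8, seed=0):
    """Split 'path label' lines into train/test, keeping the
    train/test ratio roughly constant for every class."""
    by_class = defaultdict(list)
    for line in labeled_lines:
        label = line.rsplit(" ", 1)[1]
        by_class[label].append(line)
    rng = random.Random(seed)
    train, test = [], []
    for label, lines in by_class.items():
        rng.shuffle(lines)
        cut = int(len(lines) * train_frac)
        train.extend(lines[:cut])
        test.extend(lines[cut:])
    return train, test
```

Writing the two returned lists to train_data.txt and test_data.txt finishes this step.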

I am going to extract features of every image using the pre-trained models. After extracting them, I am going to train SVM classifiers to predict the classes of test_data.

The code to extract features and to train the SVMs is here on GitHub.
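As a hedged sketch of the SVM step, here is what it could look like with scikit-learn (the linked code may differ); the random arrays below are stand-ins for the CNN features extracted by Caffe:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
train_feats = rng.randn(60, 64)        # stand-in for e.g. 4096-D fc7 features
train_labels = rng.randint(0, 12, 60)  # 12 aYahoo classes

# fit one linear SVM per class (one-vs-rest)
clf = LinearSVC()
clf.fit(train_feats, train_labels)
```

At test time, `clf.predict(test_feats)` gives the predicted class for each image.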

Happy classifying images.

Install Caffe on Ubuntu 14.04

1. As my machine has no GPU hardware, I am going to install Caffe without CUDA support.
2. I am going to install only Python wrapper for Caffe.

As the installation page of Caffe has no detailed instructions for installing all the required dependencies and getting your system ready to run CNNs, I am writing this small tutorial to get Caffe up and running on your machine.

First, we will get started by downloading the latest release of Caffe from GitHub. (You need to have Git installed on your machine.)

git clone https://github.com/BVLC/caffe.git
#add these following lines in the end of your .bashrc file
export PYTHONPATH=$PYTHONPATH:$HOME/caffe/python
export PYTHONPATH=$PYTHONPATH:$HOME/caffe/python/caffe

It is advised to use the Anaconda Python distribution, as it installs most of the requirements for the Python wrapper around Caffe (though it also installs a lot of packages we have no use for). But installing Anaconda is up to you.
Download Anaconda and run the shell script. Then add the path of its bin directory to $PATH in your ~/.bashrc file.

bash Anaconda-2.1.0-Linux-x86_64.sh
#add the following line at the end to your .bashrc
export PATH=/home/suryateja/anaconda/bin:$PATH

Get gflags and install on your system.

wget https://github.com/schuhschuh/gflags/archive/master.zip
unzip master.zip
cd gflags-master
mkdir build && cd build
export CXXFLAGS="-fPIC" && cmake .. && make VERBOSE=1
sudo make install

Download google-glog and install it (you can also install it through apt-get). Then install snappy.

tar -xvzf google-glog_0.3.3.orig.tar.gz
cd glog-0.3.3/
./configure
make
sudo make install
sudo apt-get install libgoogle-glog-dev
sudo apt-get install libsnappy-dev

Get leveldb onto your machine and lmdb with its dependencies.

git clone https://code.google.com/p/leveldb/
cd leveldb
make
sudo cp --preserve=links libleveldb.* /usr/local/lib
sudo cp -r include/leveldb /usr/local/include/
sudo ldconfig
sudo apt-get install libffi-dev python-dev build-essential
sudo apt-get install liblmdb-dev

Download Atlas and Lapack (they download directly when you click the links!). But before installing, you need to modify your CPU frequency scaling. If you get errors saving the file, try a different editor; I tried Sublime and gedit, and finally could modify it with emacs. Change the scaling for all of your CPU cores by replacing the single word in each file with ‘performance’.

gksu emacs /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
gksu emacs /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
gksu emacs /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
gksu emacs /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor

bunzip2 -c atlas3.10.2.tar.bz2 | tar xfm -
mv ATLAS ATLAS3.10.2
cd ATLAS3.10.2
mkdir Linux_C2D64SSE3
cd Linux_C2D64SSE3
../configure -b 64 -D c -DPentiumCPS=2400 --prefix=/home/suryateja/lib/atlas --with-netlib-lapack-tarfile=/home/suryateja/Downloads/lapack-3.5.0.tgz

make build
make check
make ptcheck
make time
sudo make install

Install the dependencies for the Python wrapper if you didn’t choose to install the Anaconda Python distribution. The requirements file is in caffe/python. Then install hdf5 via apt-get.

cd python/
for req in $(cat requirements.txt); do sudo pip install $req; done
sudo apt-get install libhdf5-serial-dev

Download the latest 3.0.0-beta version of OpenCV (the file downloads automatically when you click) and build it. But before that, install all the dependencies of OpenCV.

sudo apt-get install build-essential
sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev

cd Downloads
cd opencv-3.0.0-beta/
mkdir release
cd release
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..
make
sudo make install

Now we are almost done with getting all the dependencies of Caffe. So, let’s dive into getting Caffe up and running on our machines. Modify the makefile to your needs as shown on the installation page.

cd caffe
cp Makefile.config.example Makefile.config
make all
make test
make runtest
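For a CPU-only build like this one, the relevant Makefile.config edits are roughly the following (the Anaconda path is an assumption based on the default install location; leave it commented if you skipped Anaconda):

```makefile
# in Makefile.config: uncomment for a build without CUDA
CPU_ONLY := 1
# if using Anaconda, point the Python paths at it
ANACONDA_HOME := $(HOME)/anaconda
```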

If you get errors while installing even after following this tutorial, you can search for any known issues on their issues page. One of the errors I faced is this one, and I was able to solve it as suggested in the comments section of the issue page.

At last, now, you have Caffe installed and running on your machine.
Happy training Deep Neural Networks. 😀

Car detection in MATLAB


Hello guys, how’s it going?

Today we are going to train a cascade detector, which returns an XML file. We can use that XML file to detect objects, cars (only from side view) in this case, in an image.

As we are going to use MATLAB, I assume you have MATLAB installed on your PC along with the Image Processing and Computer Vision toolboxes. The whole post consists of two steps:

  1. Train our cascade detector with all the data files.
  2. Use the output XML file to detect objects in a pic.

The following pic. says it all.

Overview of what we are going to do in here.



Before going into the topics, let’s see what we are going to build:

Detected correctly
This is the final output we are going to get by the end.
1.  Training the cascade file

First things first: to train a cascade detector we need a dataset. A dataset contains a lot of positive and negative images of a specific object. So, download the image database from here. You can see a lot of image files (.pgm) in the folders testImages and trainImages. You can get an overview by reading the ‘README.txt’ file in the downloaded folder. In this part we are concentrating only on the trainImages folder; in the next part we get to testImages. Make new folders ‘trainImagesNeg’ and ‘trainImagesPos’ and remember the path. Copy or move the pictures in the ‘trainImages’ folder into these new folders. (As you may know from that ‘.txt’ file, all the negative pictures are named neg-n.pgm and the positive pictures pos-n.pgm.)
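The copy step can be scripted rather than done by hand; here is a sketch in Python (the folder names follow the ones above, and the pos-/neg- prefixes come from the dataset's README):

```python
import os
import shutil

def sort_dataset(src, pos_dir="trainImagesPos", neg_dir="trainImagesNeg"):
    """Copy pos-*.pgm files into pos_dir and neg-*.pgm files into neg_dir."""
    for name in os.listdir(src):
        if name.startswith("pos-"):
            dest = pos_dir
        elif name.startswith("neg-"):
            dest = neg_dir
        else:
            continue  # skip README.txt and anything else
        os.makedirs(dest, exist_ok=True)
        shutil.copy(os.path.join(src, name), os.path.join(dest, name))
```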

So, here is the line to train your data:

trainCascadeObjectDetector('carTraindata4.xml', mydata, negativeFolder);

So, what’s with those arguments? Where the heck are they initialized?

Here we go. The first argument names an XML file that is going to be saved in our current directory, so that we can use it later for detecting objects. You can name it as you wish, but don’t forget the extension ‘.xml’. The next argument is actually a struct in MATLAB, which holds the data of all positive images. It contains two fields, namely imageFilename and objectBoundingBoxes. The size of this struct would be 1x(no. of pos images), 1×550 in this case as we have 550 positive images. Have a look at this:

Screenshot of struct of positive images with objectBoundingBoxes field

In the first field, the paths of all 550 positive images are entered, and in the second field the bounding boxes of our image of interest. As we got this whole image data from a datasets site, rather than collecting it from the internet, we don’t need to take on the huge task of manually putting the bounding-box values into the second field. (Thank God!) The values in the second field are of the form [x of top-left point, y of top-left, width, height]. All the pictures in the dataset are of size (100,40) and are already cropped to the image of interest, so we can just select the whole picture by giving the arguments as [1, 1, 100, 40]. Finally, add the folder trainImagesPos to the MATLAB path by right-clicking on it and choosing Add to Path.

Okay, I see where this is going. You mean I should do this 550 times? (as there are 550 positive images)

It’s absolutely your wish, or you could use this for loop after initializing the struct ‘mydata’ (the code is self-explanatory):

mydata= struct('imageFilename', 'Just a random string', 'objectBoundingBoxes', 'Just a random string');
for i=0:549,
 mydata(i+1).imageFilename = strcat('trainImagesPos/pos-', num2str(i), '.pgm');
 mydata(i+1).objectBoundingBoxes = [1, 1, 100, 40];
end


Now the whole thing with the second argument ‘mydata’ is settled. As the name suggests, the third argument ‘negativeFolder’ is just a folder containing negative images. There is no need for bounding boxes for negative images, so no struct is needed. Just assign the folder path to the variable named negativeFolder:

negativeFolder= fullfile('C:\Users\Surya Teja Cheedella\Documents\MATLAB\carDetection\carData\trainImagesNeg')

For good training, there should be a large number of negative images. As the number of negative images in the dataset is relatively low, I copied a lot of my personal images into the trainImagesNeg folder (make sure they don’t have pics of cars in side view).

You can learn more about this function trainCascadeObjectDetector here.

Now, run the code with all arguments initialized. It took around 40 minutes to complete 13 stages of training on my laptop and returned an XML file.

Stages? What do you mean by them? Where did they come from?


Stages while training
An overview of what it’s gonna do in various stages.


2.  Detecting objects in an image.

After successful training, we can use the XML file to detect objects (cars in this case) in a picture. These lines of code will do that for us:

%initialising the variable detector with the xml file
detector= vision.CascadeObjectDetector('carTraindata4.xml');
%reading an image file in current dir.
img= imread('sun.png');
%bounding box around detected object
box= step(detector, img);
%inserting that bounding box in given picture and showing it
figure,imshow(insertObjectAnnotation(img, 'rectangle', box,' '));

I have manually tested my trained XML file with all the pics in the testImages folder. It has an accuracy of 93%, and out of 180 images these are the statistics:

  • False Positives- 10 (single object in 120 pics and double objects in remaining)
  • True Negatives- 9

Here is the code (just a for loop) to run detection on a large number of images and display them:

for j= 1:100,
 img= imread(strcat('test-', num2str(j-1), '.pgm'));
 bbox= step(detector, img);
 figure,imshow(insertObjectAnnotation(img, 'rectangle', bbox,' '));
end

As usual, my training has a small defect. You can see it in the pic below 😛

Wrongly detected images


So, Happy Training!


First ML Code on Gradient descent!

Hii there,

Hmm… my first program exceeding 20 lines.

Basically, it is a gradient descent problem (don’t know much about it 😛 ). As I am taking the Machine Learning course on Coursera, I wanted to solve some problems on ML. I found some AI problems (don’t know much about this either) on the HackerRank site and started solving this one. This guy is the output of my 5 hours of work. 😀

import java.util.Scanner;

public class houseCosts {
	public static void main(String[] args){
		Scanner in = new Scanner(System.in);
		int n = in.nextInt();                          // number of features
		int m = in.nextInt();                          // number of training examples
		float[][] x = new float[n+2][m];               // row 0: bias, rows 1..n: features, row n+1: target
		float[] t = new float[n+1];                    // theta values
		float[] temp = new float[n+1];
		float alpha = (float) 0.3;                     // alpha (learning rate)
		for(int j = 0; j < m; j++){
			for(int i = 1; i < n+2; i++){
				x[i][j] = in.nextFloat();
			}
		}
		int num = in.nextInt();                        // number of queries
		float[][] out = new float[n+1][num];
		for(int j = 0; j < num; j++){
			for(int i = 1; i < n+1; i++){
				out[i][j] = in.nextFloat();
			}
		}
		for(int i = 0; i < num; i++){
			out[0][i] = 1;                             // bias term for each query
		}
		for(int i = 0; i < m; i++){
			x[0][i] = 1;                               // bias term for each example
		}
		for(int j = 0; j < n+1; j++){
			t[j] = 0;                                  // theta value initializing
		}
		for(int p = 0; p < 500; p++){                  // no. of iterations
			for(int k = 0; k < n+1; k++){
				float dum = 0;
				for(int j = 0; j < m; j++){
					float ans = 0;
					for(int i = 0; i < n+1; i++){
						ans += t[i] * x[i][j];         // prediction for example j
					}
					ans -= x[n+1][j];                  // minus the target
					ans *= x[k][j];
					dum += ans;
				}
				temp[k] = (float) (t[k] - (alpha * dum * (1.0/m)));
			}
			for(int k = 0; k <= n; k++){
				t[k] = temp[k];                        // simultaneous update of theta
			}
		}
		for(int i = 0; i < num; i++){
			float foo = 0;
			for(int j = 0; j < n+1; j++){
				foo += out[j][i] * t[j];
			}
			System.out.println(foo);                   // predicted price for query i
		}
	}
}
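The same batch update can be written compactly in NumPy; this is just a sketch, with alpha and the iteration count mirroring the Java above:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.3, iters=500):
    """Batch gradient descent for linear regression.
    X: (m, n) feature matrix, y: (m,) targets; returns theta, bias first."""
    m = X.shape[0]
    Xb = np.hstack([np.ones((m, 1)), X])    # prepend the bias column
    theta = np.zeros(Xb.shape[1])
    for _ in range(iters):
        grad = Xb.T @ (Xb @ theta - y) / m  # gradient of the mean squared error
        theta -= alpha * grad
    return theta
```

Predicting for a query is then just `np.dot([1, *features], theta)`.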

Don’t judge me by this code, because I don’t know much about algorithms and ML. And there are so many stdouts because I dunno how to debug in Eclipse or any IDE for that matter. BTW, I got ten on ten for this problem.

Something Productive- CHECK!

Don’t forget to travel in time.