Car detection in MATLAB


Hello guys, how’s it going

Today we are going to train a cascadeDetector, which returns an XML file. We can use that XML file  to detect objects, cars (only from side-view) in this case, in an image.

As we are going to use matlab, I assume you have matlab installed on your PC along with image processing and computer vision toolboxes. The whole post is of two steps:

  1. Train our cascade detector with all the data files.
  2. Use the output XML file to detect objects in a pic.

The following pic. says it all.

Overview of what we are going to do in here.



Before going into the topics, lets see what we are going to build:

Detected correctly
This is the final output we are going to get by the end.
1.  Training the cascade file

First things first, to train a cascade detector we need a dataSet. A dataSet contains a lot of positive and negative images for a specific object. So, download the image dataBase from here. You can see a lot of image files (.pgm) in folders testImages, trainImages. You can get an overview by reading the ‘README.txt’ file in that downloaded folder. In this part we are concentrating only on trainImages folder and in next part we get onto teatImages. Make new folders ‘trainImagesNeg’ and ‘trainImagesPos’ and remember the path. Copy&Paste or Cut&Paste the pictures in ‘trainImages’ folder to these new folders. (you may know all the negative pictures are named neg-n.pgm and positive pictures as pos-n.pgm if you read that ‘.txt’ file)

So, here is the line to train your data:

trainCascadeObjectDetector('carTraindata4.xml', mydata, negativeFolder);

So, what’s with those arguments? Where the heck are they initialized.

Here we go, the first argument, a xml file is going to be saved in our current directory, so that we can use it for detecting objects. You can name it as you wish, but don’t forget the extension ‘.xml’. Next argument is actually a struct in matlab, which is the data of all positive images. It contains two fields namely imageFilename and objectBoundingBoxes. Size of this struct would be 1x(no. of pos images), 1×550 in this case as we have 550 pos images. Have a look at this:

Screenshot of struct of positive images with objectBoundingBoxes field

In the first field, the path of all 550 pos images are entered and in the second field the bounding boxes of our image of interest. As we got this whole data of images from a dataSets site, rather than collecting from internet, we don’t need to take that huge task of manually putting that values of bounding boxes into second field. (Thank God) Those values in second field are like [x of top-left point, y of top-left, width, height]. All the pictures in the dataSet are of size (100,40), and are already cropped to the image of interest. So, we can just select the whole pic by giving arguments as [1, 1, 100, 40]. And add that folder trainImagesPos to matlab path by right-clicking on it and click addpath.

Okay, I see where this is going. You mean I should do this for 550 times? (as there are 550 pos images) 

It’s absolutely your wish or you could use this for loop after initializing the struct ‘mydata’- (code is self-explanatory)

mydata= struct('imageFilename', 'Just a random string', 'objectBoundingBoxes', 'Just a random string');
for i=0:549,
 mydata(i+1).imageFilename = strcat('trainImagesPos/pos-', num2str(i), '.pgm');
 mydata(i+1).objectBoundingBoxes = [1, 1, 100, 40]


Now, the whole thing with the second argument ‘mydata’ is closed. As the name suggests the third argument ‘negativeFolder’ is just a folder containing negative images. There is no need of bounding boxes for negative images. So, no need of thing like struct. Just assign the folder path to this variable named negativeFolder-

negativeFolder= fullfile('C:\Users\Surya Teja Cheedella\Documents\MATLAB\carDetection\carData\trainImagesNeg')

For a good training, there should be a large number of negative images. As the number of neg. images in the dataSet are relatively low, I copy&pasted a lot of my personal images into that trainImagesNeg folder (make sure they don’t have pics of cars in side-view).

You can learn more about this function trainCascadeObjectDetector here.

Now, run the code with all arguments initialized. It took around 40 mins. to complete 13 stages of training on my laptop and returned a xml file.

Stages? What do you mean by them? Where did they come from?


Stages while training
An overview of what it’s gonna do in various stages.


2.  Detecting objects in an image.

After successful training, we can use the xml file to detect objects (cars in this case) in a picture. These lines of code will do that for us:

%initialising the variable detector with the xml file
detector= vision.CascadeObjectDetector('carTraindata3.xml');
%reading an image file in current dir.
img= imread('sun.png');
%bounding box around detected object
box= step(detector, img);
%inserting that bounding box in given picture and showing it
figure,imshow(insertObjectAnnotation(img, 'rectangle', box,' '));

I have manually tested my trained xml file with all the pics in the testImages folder. It has an accuracy of 93% and out of 180 images these are the statistics:

  • False Positives- 10 (single object in 120 pics and double objects in remaining)
  • True Negatives- 9

Here is the code (just a for loop) to detect a large number of images and display them-

for j= 1:100,
 img= imread(strcat('test-', num2str(j-1), '.pgm'));
 bbox= step(detector, img);
 figure,imshow(insertObjectAnnotation(img, 'rectangle', bbox,' '));

As usual my training has a small defect. You can understand by seeing the pic below 😛

Wrongly detected images


So, Happy Training!


Face Detection using openCV in Python


Hello folks, How’s it going!

Today I am going to introduce my face detection algorithm. (not a big one though) Don’t think that this is really a huge task! I am not working from scratch (means I am not actually gathering a huge data set of all pictures (both negative and positive, i.e having faces and not ) and train my algorithm). I have used haar Cascading files.

What the heck are they? Here we go.

Taking samples of a lot of image files of both types (having faces and not having faces) and train the algorithm, make it learn when ever (most probably every time 😛 ) it makes a mistake and store the whole data into xml files. In this case I am using haar Cascading xml files. As I already said they are huge, wanna know their size? 35k lines of code in each xml file! Yes, you read that right, 35K lines of random numbers. 😛

Coming to the juicy part, the code, here it is. I have commented briefly what each line is contributing.

#import numpy as np
import cv2
#import PIL

#import the xml files. here i am using frontal face and eye.
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')

img= cv2.imread('5.jpg') #reading the image
gray_img= cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #converting into gratscale(algos work with grayscale images)

faces = face_cascade.detectMultiScale(gray_img, 1.3, 5) #detecting faces. lightning conditions may affect the output
print (faces) #just printing to console. this will print boundary points of detected face(s)

#searching for eyes only in faces. easy and efficient to search only in face rather than whole image
for (p,q,r,s) in faces:
 cv2.rectangle(img,(p,q),(p+r,q+s),(150,125,0),2)        #drawing a rectangle indicating face
 face_gray = gray_img[q:q+s, p:p+r] #cropping   face in gray image
 face_color = img[q:q+s, p:p+r] #cropping face in  color image
 eyes = eye_cascade.detectMultiScale(face_gray)  #searching for eyes in grayscale img
 for (ep,eq,er,es) in eyes:
  cv2.rectangle(face_color,(ep,eq),(ep+er,eq+es), (100,210,150),2) #for each eye drawing rectangle

cv2.imshow("output", img)
cv2.waitKey(0) #showing the img

#this only takes a image and shows the faces in the image. It dont modifies the image.
#if you want to save the resulted image use this...
#cv2.imwrite("output.jpg", img)

Wanna know what it did? I gave this image

Input image

and it returned THIS!…….

Screenshot of the output


Currently, I am working on alphabet detection thing to won in a bet with my friend.

Bieeee  ^_^

Here somewhere in Milky Way


First ML Code on Gradient descent!

Hii there,

Hmm… my first program exceeding 20 lines.

Basically, it is a gradient descent problem (don’t know much about it 😛 ). As I am taking Machine Learning course on Cousera I wanna solve some problems on ML. I found some AI problems (don’t know much about this too) on HakerRank site and started solving this one. This guy is an output of my 5 hours of work. 😀

import java.util.Scanner;

public class houseCosts {
	public static void main(String[] args){
		Scanner in = new Scanner(;
		int n = in.nextInt();
		int m = in.nextInt();
		float[][] x = new float[n+2][m];
		float[] t = new float[n+1];                       
		float[] temp = new float[n+1];
		float alpha = (float) 0.3;                      //alpha
		for(int j = 0; j<m; j++){
			for(int i=1; i<n+2; i++){
				x[i][j] = in.nextFloat();
				//System.out.println("x "+i+" "+j+"= "+x[i][j]);
		int num = in.nextInt();
		float[][] out = new float[n+1][num];
		for(int j = 0; j<num; j++){
			for(int i=1; i<n+1; i++){
				out[i][j] = in.nextFloat();
				//System.out.println("out "+i+" "+j+"= "+out[i][j]);
		for(int i=0; i<num; i++){
			out[0][i] = 1;
			//System.out.println("x "+0+" "+i+"= "+x[0][i]);
		for(int i=0; i<m; i++){
			x[0][i] = 1;
			//System.out.println("x "+0+" "+i+"= "+x[0][i]);
		for(int j=0; j<n+1; j++){
			t[j] = 0;                                  //theta value initializing
		for(int p = 0; p<500; p++){                   //no. of times
			for(int k=0; k<n+1; k++){
				float dum = 0;
				for(int j=0; j<m; j++){
					float ans = 0;
					for(int i=0; i<n+1; i++){
						ans+= t[i] * x[i][j];
					ans-= x[n+1][j];
					ans*= x[k][j];
					//System.out.println("x "+k+" "+j+" ="+x[k][j]);
				temp[k] = (float) (t[k]-(alpha * dum * (1.0/m)));
			for(int k=0; k<=n; k++){
				//System.out.print(t[k]+" ");
			//System.out.println(" ");
		for(int i = 0; i<num; i++){
			float foo = 0;
			for(int j=0; j<n+1; j++){
				foo+=out[j][i] * t[j];

Don’t judge me by this code coz, I don’t know much about algorithms and ML. And there are so many stdouts because I dunno how to debug in eclipse or any IDE for that mater. BTW, I got ten on ten for this problem.

Something Productive- CHECK!

Don’t forget to travel in time.




Birthday code! (Thanking friends on facebook using GraphAPI)

I wrote this on my birthday,

import facebook

graph = facebook.GraphAPI('Access token you got from facebook developers page')
feed = graph.get_connections("suryateja.cheedella", "feed?limit=200")
for a in range (0,200):
    post = feed["data"][a]
    fromDetails = (post["from"])
    fromName = fromDetails["name"]
    graph.put_object(post["id"], "comments", message="Thank you :d " + fromName)

and then there, it thanked all my friends who wished me on my timeline.

Hello, what’s up guys,

It’s afternoon on 4th and as usually I am browsing on quora and facebook. I stumbled upon ‘what are the best python codes you have written’ question on quora. I saw some top answers which are actually thanking friends on FB on their birthday. Actually it’s the best time to learn or just copy&paste all of that code so, I can make my thanking task easy for tomorrow. But, I wrote ‘Hello world’ program in python just two weeks back. And completed reading like 100 pages of ‘Head first python’. So, I started searching on web and got to know about some thing called GraphAPI. After spending like 5 to 6 hours, I wrote that 10 lines of code (in a perfectly working condition) and tested it.

You can say, dude why don’t you use that already written code found on quora. Basically all those codes are old. I mean those are the codes written before releasing GraphAPI by facebook. With that thing for facebook developers, one can reduce their code to 10 lines (yeah, you look their) from 40 to 50 lines of code which I have found on Quora.

That is not the best way to reach our goal (like commenting on last ‘x’ number of posts on my wall) but, that is what I can learn in that span of time.