Hello guys, how’s it going?
Today we are going to train a cascade object detector. Training produces an XML file, which we can then use to detect objects in an image: cars (side view only) in this case.
Since we are going to use MATLAB, I assume you have it installed on your PC along with the Image Processing and Computer Vision toolboxes. The whole post has two steps:
- Train our cascade detector with all the data files.
- Use the output XML file to detect objects in a pic.
The following pic. says it all.
Before going into the topics, let’s see what we are going to build:
1. Training the cascade file.
First things first: to train a cascade detector we need a dataset, that is, a large set of positive and negative images for a specific object. So, download the image database from here. You will see a lot of image files (.pgm) in the folders testImages and trainImages. You can get an overview by reading the ‘README.txt’ file in the downloaded folder. In this part we concentrate only on the trainImages folder; we get to testImages in the next part. Make two new folders, ‘trainImagesNeg’ and ‘trainImagesPos’, and remember their paths. Copy or move the negative pictures from ‘trainImages’ into ‘trainImagesNeg’ and the positive ones into ‘trainImagesPos’. (If you read that ‘.txt’ file, you know that the negative pictures are named neg-n.pgm and the positive pictures pos-n.pgm.)
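If you would rather not sort the files by hand, the copying can be scripted. This is only a minimal sketch: it assumes the downloaded ‘trainImages’ folder sits in the current MATLAB directory and that the files follow the pos-n.pgm / neg-n.pgm naming mentioned above. The variable names are my own.

```matlab
% Assumption: 'trainImages' is in the current directory.
srcDir = 'trainImages';
posDir = 'trainImagesPos';
negDir = 'trainImagesNeg';

% Create the two target folders if they do not exist yet
if ~exist(posDir, 'dir'), mkdir(posDir); end
if ~exist(negDir, 'dir'), mkdir(negDir); end

% Copy every pos-*.pgm file into the positive folder
posFiles = dir(fullfile(srcDir, 'pos-*.pgm'));
for k = 1:numel(posFiles)
    copyfile(fullfile(srcDir, posFiles(k).name), posDir);
end

% Copy every neg-*.pgm file into the negative folder
negFiles = dir(fullfile(srcDir, 'neg-*.pgm'));
for k = 1:numel(negFiles)
    copyfile(fullfile(srcDir, negFiles(k).name), negDir);
end

fprintf('Copied %d positive and %d negative images.\n', ...
    numel(posFiles), numel(negFiles));
```

The originals stay in ‘trainImages’, so nothing is lost if the script is run twice.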
So, here is the line to train your data:
trainCascadeObjectDetector('carTraindata4.xml', mydata, negativeFolder);
So, what’s with those arguments? Where the heck are they initialized?
Here we go. The first argument is the name of the XML file that will be saved in our current directory, so that we can use it later for detecting objects. You can name it as you wish, but don’t forget the extension ‘.xml’. The next argument is a MATLAB struct array holding the data for all the positive images. It has two fields, imageFilename and objectBoundingBoxes. Its size is 1x(no. of pos images), 1×550 in this case as we have 550 pos images. Have a look at this:
In the first field we enter the path of each of the 550 pos images, and in the second field the bounding box of the object of interest in that image. As we got all these images from a dataset site rather than collecting them from the internet, we don’t have to take on the huge task of manually entering bounding box values into the second field. (Thank God) Those values have the form [x of top-left point, y of top-left point, width, height]. All the pictures in the dataset are 100 pixels wide and 40 pixels high, and are already cropped to the object of interest. So, we can just select the whole picture by giving [1, 1, 100, 40]. Also add the folder trainImagesPos to the MATLAB path by right-clicking on it and choosing ‘Add to Path’.
Okay, I see where this is going. You mean I should do this 550 times? (as there are 550 pos images)
It’s entirely up to you, or you could use this for loop after initializing the struct ‘mydata’ (the code is self-explanatory):
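% preallocate a 1x550 struct array with the two required fields
mydata = struct('imageFilename', cell(1, 550), 'objectBoundingBoxes', []);
for i = 0:549
    % positive images are named pos-0.pgm ... pos-549.pgm
    mydata(i+1).imageFilename = strcat('trainImagesPos/pos-', num2str(i), '.pgm');
    % the whole 100x40 picture is the object of interest
    mydata(i+1).objectBoundingBoxes = [1, 1, 100, 40];
end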
That takes care of the second argument, ‘mydata’. As the name suggests, the third argument ‘negativeFolder’ is just a folder containing negative images. Negative images need no bounding boxes, so no struct is needed. Just assign the folder path to the variable negativeFolder:
negativeFolder = fullfile('C:\Users\Surya Teja Cheedella\Documents\MATLAB\carDetection\carData\trainImagesNeg');
For good training, there should be a large number of negative images. As the number of neg. images in the dataset is relatively low, I copied a lot of my personal images into the trainImagesNeg folder (make sure they don’t contain cars in side view).
You can learn more about this function trainCascadeObjectDetector here.
Now, run the code with all arguments initialized. It took around 40 minutes to complete 13 stages of training on my laptop and produced an XML file.
Stages? What do you mean by them? Where did they come from?
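A short answer, as far as the MATLAB documentation goes: a cascade detector is a chain of stages, each a small classifier that either rejects an image window as "not the object" or passes it on to the next stage. trainCascadeObjectDetector trains these stages one after another. By default it aims for 20, but it may stop earlier when it runs out of negative samples, which is possibly why training ended at 13 here. The stage count and related settings can be passed as name-value pairs. The values below are only illustrative, not the settings used for this post, and mydata and negativeFolder are assumed to be set up as described above:

```matlab
% Assumption: 'mydata' and 'negativeFolder' already exist in the workspace.
% The option values are illustrative, not the ones used for this post.
trainCascadeObjectDetector('carTraindata4.xml', mydata, negativeFolder, ...
    'NumCascadeStages', 13, ...   % upper limit on stages; training may stop earlier
    'FalseAlarmRate', 0.5, ...    % acceptable false alarm rate per stage (documented default)
    'FeatureType', 'HOG');        % documented default; 'LBP' and 'Haar' are also accepted
```

Lowering the per-stage false alarm rate makes each stage stricter but needs more negative samples, so with a small negative set training tends to end sooner.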
2. Detecting objects in an image.
After successful training, we can use the XML file to detect objects (cars in this case) in a picture. These lines of code will do that for us:
% initialize the detector with the XML file produced by training
detector = vision.CascadeObjectDetector('carTraindata4.xml');
% read an image file from the current directory
img = imread('sun.png');
% bounding boxes around detected objects, one [x y width height] row each
box = step(detector, img);
% draw the bounding boxes on the picture and show it
figure, imshow(insertObjectAnnotation(img, 'rectangle', box, ' '));
I have manually tested my trained XML file on all the pictures in the testImages folder. It has an accuracy of 93%, and out of 180 images these are the statistics:
- False Positives- 10 (single object in 120 pics and double objects in remaining)
- True Negatives- 9
Here is the code (just a for loop) to run detection on a large number of images and display the results:
for j = 1:100
    % test images are named test-0.pgm ... test-99.pgm
    img = imread(strcat('test-', num2str(j-1), '.pgm'));
    bbox = step(detector, img);
    figure, imshow(insertObjectAnnotation(img, 'rectangle', bbox, ' '));
    pause(0.5);
end
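To make the manual tally a bit less tedious, one could also count the detections per image instead of eyeballing every figure. A minimal sketch, assuming the trained file is named 'carTraindata4.xml' as in the training line, and that test-0.pgm to test-99.pgm are on the MATLAB path. The names numImages and counts are my own:

```matlab
% Assumption: the trained XML file and the test images are on the MATLAB path.
detector = vision.CascadeObjectDetector('carTraindata4.xml');

numImages = 100;
counts = zeros(numImages, 1);    % number of detections in each image
for j = 1:numImages
    img = imread(sprintf('test-%d.pgm', j-1));
    bbox = step(detector, img);
    counts(j) = size(bbox, 1);   % one row per detected object
end

fprintf('Images with no detection:      %d\n', nnz(counts == 0));
fprintf('Images with one detection:     %d\n', nnz(counts == 1));
fprintf('Images with two or more boxes: %d\n', nnz(counts >= 2));
```

This only counts boxes. Whether a box actually sits on a car still has to be checked by eye (or against ground truth, if the dataset provides it).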
As usual my training has a small defect. You can understand by seeing the pic below 😛
So, Happy Training!