In my previous posts we learned how to use classifiers for face detection, and how to create a dataset, train on it, and use it for face recognition. In this post we will look at object recognition: recognizing a specific object in an image (for example a book) using a SIFT/SURF feature extractor and a FLANN-based KNN matcher.
Many of you have asked me for a tutorial on this, so here it is.
Let's Do Object Recognition
So before we start, let's create a new folder in our project folder; I am naming it "Object Rec". Inside this we are going to save everything related to object recognition.
Now create a folder inside that and name it "TrainingData". This is where we will save the training image that we are going to recognize in the live webcam feed.
Next we need a sample image of the object we are going to track or recognize.

So I am using this as my training image. Once you have the image of the object you want to track, place the file inside "TrainingData" and rename it to "TrainImg.jpg".
Let's Start Coding
So we are ready with the setup. Now open your favourite Python editor and jump straight to the object recognition code.
First let's import the libraries, which are just numpy and cv2:
import cv2
import numpy as np
After adding this we need to create the SIFT/SURF feature extractor, which will extract distinct features from images as key points for our object recognition.
We will also need a feature matcher, which will match the features from the sample/training image with the current frame from the webcam:
detector=cv2.SIFT()
FLANN_INDEX_KDITREE=0
flannParam=dict(algorithm=FLANN_INDEX_KDITREE,trees=5)
flann=cv2.FlannBasedMatcher(flannParam,{})
In the above code we initialized the SIFT feature extractor as detector and the FLANN feature matcher as flann. (Note: if you are on OpenCV 3.x, SIFT has moved to the contrib modules, so use cv2.xfeatures2d.SIFT_create() instead of cv2.SIFT().)
Now let's load the training image from the folder that we created earlier and extract its features first:
trainImg=cv2.imread("TrainingData/TrainImg.jpg",0)
trainKP,trainDesc=detector.detectAndCompute(trainImg,None)
In the above code we used cv2.imread() to load the image which we saved earlier (the 0 flag loads it in grayscale). Next we used the feature extractor to detect features and stored the results in two variables: trainKP, the list of key points (coordinates of the features), and trainDesc, the list of descriptors of the corresponding key points.
We will need these to find visually similar objects in our live video.
Now let's initialize the camera with a VideoCapture object:
cam=cv2.VideoCapture(0)
Start The Main LOOP!!
Now that we are done with all the preparation, we can start the main loop and get to the main work:
while True:
    ret, QueryImgBGR=cam.read()
    QueryImg=cv2.cvtColor(QueryImgBGR,cv2.COLOR_BGR2GRAY)
    queryKP,queryDesc=detector.detectAndCompute(QueryImg,None)
    matches=flann.knnMatch(queryDesc,trainDesc,k=2)
In the above code we first captured a frame from the camera and converted it to grayscale. Then we extracted its features just like we did for the training image, and used the FLANN feature matcher to match the features in both images, storing the results in the matches variable.
Here FLANN uses KNN with k=2, so for every query feature we get its 2 nearest training features.
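To make the k=2 matching concrete, here is a tiny NumPy sketch of what the matcher conceptually computes: a brute-force 2-nearest-neighbour search over toy 2-D descriptors. (This is only an illustration with made-up numbers; real SIFT descriptors are 128-dimensional, and FLANN uses an approximate kd-tree index instead of an exhaustive distance matrix.)

```python
import numpy as np

# Toy descriptors: 3 query rows, 4 train rows, 2-D each (real SIFT uses 128-D).
queryDesc = np.array([[0.0, 0.0],
                      [5.0, 5.0],
                      [9.0, 0.0]])
trainDesc = np.array([[0.1, 0.0],
                      [5.0, 4.0],
                      [9.0, 9.0],
                      [8.0, 0.0]])

# Pairwise Euclidean distances, shape (3, 4): one row per query descriptor.
dists = np.linalg.norm(queryDesc[:, None, :] - trainDesc[None, :, :], axis=2)

# For each query descriptor, the indices of its 2 nearest train descriptors,
# analogous to what flann.knnMatch(queryDesc, trainDesc, k=2) returns.
nearest2 = np.argsort(dists, axis=1)[:, :2]
print(nearest2.tolist())  # [[0, 1], [1, 2], [3, 1]]
```

Each query feature thus comes back with a best and a second-best training match, which is exactly what the ratio test below needs.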
After this we have to filter the matches to discard false matches:
    goodMatch=[]
    for m,n in matches:
        if(m.distance<0.75*n.distance):
            goodMatch.append(m)
In the above code we created an empty list named goodMatch, and for each feature we compare the distance to the nearest neighbour m with the distance to the second-nearest neighbour n. We consider it a good match only if the distance to "m" is less than 75% of the distance to "n" (Lowe's ratio test), and append that match to "goodMatch".
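Here is a minimal self-contained sketch of that ratio test on made-up distances. The Match namedtuple is a stand-in for OpenCV's DMatch; only the .distance field matters for this step.

```python
from collections import namedtuple

# Stand-in for cv2.DMatch; only .distance is used by the ratio test.
Match = namedtuple("Match", "distance")

# Each pair is (best neighbour m, second-best neighbour n),
# the shape knnMatch(..., k=2) returns. Distances are made up.
matches = [(Match(10.0), Match(40.0)),  # best is clearly closer -> keep
           (Match(30.0), Match(32.0)),  # almost a tie, ambiguous -> drop
           (Match(5.0),  Match(20.0))]  # clearly closer          -> keep

goodMatch = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:  # Lowe's ratio test
        goodMatch.append(m)

print(len(goodMatch))  # 2
```

The ambiguous pair is dropped because when the two nearest training features are almost equally far away, the match could easily be wrong.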
We also need to make sure that we have enough feature matches to call this a detection. For that we set a threshold "MIN_MATCH_COUNT", and only if the number of good matches is at least that value do we consider the object found:
MIN_MATCH_COUNT=30
    if(len(goodMatch)>=MIN_MATCH_COUNT):
        tp=[]
        qp=[]
        for m in goodMatch:
            tp.append(trainKP[m.trainIdx].pt)
            qp.append(queryKP[m.queryIdx].pt)
        tp,qp=np.float32((tp,qp))
        H,status=cv2.findHomography(tp,qp,cv2.RANSAC,3.0)
        h,w=trainImg.shape
        trainBorder=np.float32([[[0,0],[0,h-1],[w-1,h-1],[w-1,0]]])
        queryBorder=cv2.perspectiveTransform(trainBorder,H)
        cv2.polylines(QueryImgBGR,[np.int32(queryBorder)],True,(0,255,0),5)
    else:
        print "Not Enough match found- %d/%d"%(len(goodMatch),MIN_MATCH_COUNT)
So in the above code we first check whether the number of matched features exceeds the minimum threshold; only then do we do the further processing.
We create two empty lists to collect the coordinates of the matched features from the training image (tp) and from the query image (qp), and convert them to NumPy arrays.
Then we use cv2.findHomography(tp,qp,cv2.RANSAC,3.0) to find the homography "H" that maps points in the training image to the corresponding points in the query image.
Since we want to draw a border around the object, we take the coordinates of the border corners of the training image, which are (0,0), (0,h-1), (w-1,h-1), (w-1,0), where h and w are the height and width of the training image.
Using the homography "H" that we got earlier, we translate those corner coordinates from the training image into the query image.
Finally we use "cv2.polylines()" to draw the border in the query image.
Lastly, if the number of good matches is less than the minimum match count, we print that on the screen in the else branch.
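To see what cv2.perspectiveTransform does with "H", here is a NumPy-only sketch that maps the four training-image corners into query-image coordinates. The 3x3 matrix H below is a made-up pure translation, not one estimated by findHomography, and the image size is hypothetical.

```python
import numpy as np

h, w = 100, 200  # hypothetical training-image height and width
trainBorder = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]])

# A toy homography: shift every point by (50, 20). In the real code,
# findHomography estimates this matrix from the matched point pairs.
H = np.float32([[1, 0, 50],
                [0, 1, 20],
                [0, 0,  1]])

# Equivalent of cv2.perspectiveTransform: lift points to homogeneous
# coordinates, multiply by H, then divide by the third coordinate.
ones = np.ones((len(trainBorder), 1), np.float32)
pts = np.hstack([trainBorder, ones]) @ H.T
queryBorder = pts[:, :2] / pts[:, 2:]

print(queryBorder.tolist())
# corners shifted by (50, 20):
# [[50.0, 20.0], [50.0, 119.0], [249.0, 119.0], [249.0, 20.0]]
```

For a general homography the division by the third coordinate actually matters (it produces the perspective distortion); for this translation it divides by 1.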
Now let's display the image, and release the camera and close the window when the loop ends:
    cv2.imshow('result',QueryImgBGR)
    if cv2.waitKey(10)==ord('q'):
        break
cam.release()
cv2.destroyAllWindows()
Complete Object Recognition Code
import cv2
import numpy as np

MIN_MATCH_COUNT=30

detector=cv2.SIFT()
FLANN_INDEX_KDITREE=0
flannParam=dict(algorithm=FLANN_INDEX_KDITREE,trees=5)
flann=cv2.FlannBasedMatcher(flannParam,{})

trainImg=cv2.imread("TrainingData/TrainImg.jpg",0)
trainKP,trainDesc=detector.detectAndCompute(trainImg,None)

cam=cv2.VideoCapture(0)
while True:
    ret, QueryImgBGR=cam.read()
    QueryImg=cv2.cvtColor(QueryImgBGR,cv2.COLOR_BGR2GRAY)
    queryKP,queryDesc=detector.detectAndCompute(QueryImg,None)
    matches=flann.knnMatch(queryDesc,trainDesc,k=2)

    goodMatch=[]
    for m,n in matches:
        if(m.distance<0.75*n.distance):
            goodMatch.append(m)
    if(len(goodMatch)>=MIN_MATCH_COUNT):
        tp=[]
        qp=[]
        for m in goodMatch:
            tp.append(trainKP[m.trainIdx].pt)
            qp.append(queryKP[m.queryIdx].pt)
        tp,qp=np.float32((tp,qp))
        H,status=cv2.findHomography(tp,qp,cv2.RANSAC,3.0)
        h,w=trainImg.shape
        trainBorder=np.float32([[[0,0],[0,h-1],[w-1,h-1],[w-1,0]]])
        queryBorder=cv2.perspectiveTransform(trainBorder,H)
        cv2.polylines(QueryImgBGR,[np.int32(queryBorder)],True,(0,255,0),5)
    else:
        print "Not Enough match found- %d/%d"%(len(goodMatch),MIN_MATCH_COUNT)
    cv2.imshow('result',QueryImgBGR)
    if cv2.waitKey(10)==ord('q'):
        break
cam.release()
cv2.destroyAllWindows()
Complete Video Guide
Updates:
GitHub link:
For the code in this post, visit: https://github.com/thecodacus/object-recognition-sift-surf
hi, i am using opencv 3.1.0 and i am getting the following error:
cv2.error: /home/pi/opencv_contrib-3.1.0/modules/xfeatures2d/src/sift.cpp:770: error: (-5) image is empty or has incorrect depth (!=CV_8U) in function detectAndCompute
i know that for opencv 3.1.0 the SIFT is moved to contrib repo so i also changed this line:
from detector=cv2.SIFT()
to detector = cv2.xfeatures2d.SIFT_create()
and it worked nice but now i am getting above error.
okay, try to convert the image to grayscale first, then try to compute. i think there is some dimension issue
hi anirban,i’m using opencv 3.1 ,but i put SURF instead of SIFT function and when i run the object detection code it shows error like this
OpenCV Error: Assertion failed (The data should normally be NULL!) in allocatfile /home/pi/opencv-3.1.0/modules/python/src2/cv2.cpp, line 163
Traceback (most recent call last):
File “obj-det.py”, line 23, in
matches=flann.knnMatch(queryDesc,trainDesc,k=2)
cv2.error: /home/pi/opencv-3.1.0/modules/python/src2/cv2.cpp:163: error: (-215) The data should normally be NULL! in function allocate
please help me out and thanks for your reply for my previous post.
as i said try to use lower resolution images
hi, i am using opencv 2.7.13 and i am getting the following error:
Traceback (most recent call last):
File “C:/Python27/best.py”, line 6, in
detector=cv2.SIFT()
AttributeError: ‘module’ object has no attribute ‘SIFT’
You installed python 2.7.13 and opencv3, you can check it by executing this line in python interpreter “print cv2.__version__”
in opencv3 you have to build the contrib repo located at https://github.com/Itseez/opencv_contrib, as it is not included with opencv by default.
hi i am getting the same error in python 2.7. my opencv version is 2.4.9.1
Hey. Thank you for the code. When I run "cv2.imshow('result',QueryImgBGR)" I get the picture shown, but if I run "cv2.imwrite('result.jpg',QueryImgBGR)" I do not get the same picture. What happens here? I want to write the picture that imshow() gives me, how do I do that?
I think you are writing the image before drawing anything on it.. otherwise i can't see why it would be different
Hi,
Great tutorial! Do you have one also for finding an object in a still image instead of the webcam feed?
Love to see more!
It's simple, just use "cv2.imread()" to read the image instead of getting the frame from the webcam
Hello Anirban
I left a comment but my comment disappeared;;
well, I'm new to opencv and my first language is not English, please understand!
I want to detect a paper box! so I got 655 pictures of it (for machine learning)
but after reading all your posts, it looks like I need an xml file to grab the box in my pictures
so if I want to make a dataset of my box
what should I do? I really need your help 🙂
Hi peter, in my tutorial i haven't used any xml file to detect the object.. that's what i used for face detection.. you don't need that here
Hi… I applied the code in my system and I am getting “segmentation fault python” every time . Can you suggest me an efficient way to deal with this error
it might be stuck in some infinite loop… can you paste the code here
import cv2
import numpy as np
MIN_MATCH_COUNT=35
detector=cv2.xfeatures2d.SURF_create()
FLANN_INDEX_KDITREE=0
flannParam=dict(algorithm=FLANN_INDEX_KDITREE,tree=5)
flann=cv2.FlannBasedMatcher(flannParam,{})
trainImg=cv2.imread("download.jpg",0)
trainKP,trainDesc=detector.detectAndCompute(trainImg,None)
cam=cv2.VideoCapture(0)
while True:
ret, QueryImgBGR=cam.read()
QueryImg=cv2.cvtColor(QueryImgBGR,cv2.COLOR_BGR2GRAY)
queryKP,queryDesc=detector.detectAndCompute(QueryImg,None)
matches=flann.knnMatch(queryDesc,trainDesc,k=2)
goodMatch=[]
for m,n in matches:
if(m.distance<0.75*n.distance):
goodMatch.append(m)
if(len(goodMatch)>MIN_MATCH_COUNT):
tp=[]
qp=[]
for m in goodMatch:
tp.append(trainKP[m.trainIdx].pt)
qp.append(queryKP[m.queryIdx].pt)
tp,qp=np.float32((tp,qp))
H,status=cv2.findHomography(tp,qp,cv2.RANSAC,3.0)
h,w=trainImg.shape
trainBorder=np.float32([[[0,0],[0,h-1],[w-1,h-1],[w-1,0]]])
queryBorder=cv2.perspectiveTransform(trainBorder,H)
cv2.polylines(QueryImgBGR,[np.int32(queryBorder)],True,(0,255,0),5)
else:
print ("Not Enough match found- %d/%d"%(len(goodMatch),MIN_MATCH_COUNT))
cv2.imshow('result',QueryImgBGR)
if cv2.waitKey(10)==ord('q'):
break
cam.release()
cv2.destroyAllWindows()
sorry, I wanted to know if the indentation was correct. seems like the comment system is removing the indentation.. can you take a screenshot and post it? that will be better
https://uploads.disquscdn.com/images/b5455f5cbafc86a39f37ef5384ed5ffbf70ed85b7a12803a1e7e8e864aff7ee8.png https://uploads.disquscdn.com/images/8513f36cd5af928c1e934267c70211bc1e453e13e726c6baaa2abf48aa81d7ba.png
here are the screenshots
what is the size of your training image and query image?.. somehow your program is running out of memory
Training image is of 13.2kb and shape 300 * 320.. Query image is taken from webcam
web cam gives frame size of 1366*768
try resizing the image to a small number
resizing the training image?
the query image, the one with 1366*768 pixels
I haven’t used any query image. i am taking it directly from webcam
thats what i am calling query image, resize that to something smaller after capturing and before processing
okay
you can use cv2.resize() to resize it to proper size
you are using opencv3.. so there might be some changes.. try one thing: put some print commands in between the code, each printing a different number… and see exactly where the code is stopping
I did that.. For 3-4 secs it is printing all the numbers in the code and suddenly it is terminating at any random number
FLANN_INDEX_KDITREE=0 is causing a problem for me, it gives an error and says SIFT can not find it
check spelling.. you might be typing different thing in two places
check the spelling… and flann has nothing to do with sift both are separate module.. there might be something else
i wrote the code by myself first, then i used your code,,, same result,,, i am using python 2.7 and opencv. in the opencv documentation i did not find anything useful about it
Can you paste the exact error msg.. I will understand it better that way
I am using opencv 2.4.13 version… I also see the error message like this:
line 5, in
detector=cv2.SIFT()
AttributeError: ‘module’ object has no attribute ‘SIFT’
What should I do?
to be sure that you are using OpenCV 2.4
put this line after “import cv2”
print cv2.__version__
Getting the same error and i am using open cv 3. what should i do?
For OpenCV versions 3.0 and later you will need to use cv2.xfeatures2d.SIFT_create() for this. If that doesn't work, make sure that you installed the opencv_contrib modules from git when you built OpenCV
hi i had another ERROR while running the CODE AND im using opencv 3.1 and python 2.7
OpenCV Error: Insufficient memory (Failed to allocate 32327680 bytes) in OutOfMemoryError, file /home/pi/opencv-3.1.0/modules/core/src/alloc.cpp, line 52
OpenCV Error: Assertion failed (u != 0) in create, file /home/pi/opencv-3.1.0/modules/core/src/matrix.cpp, line 424
Traceback (most recent call last):
File “obj-det.py”, line 15, in
trainKP,trainDesc=detector.detectAndCompute(trainImg,None)
cv2.error: /home/pi/opencv-3.1.0/modules/core/src/matrix.cpp:424: error: (-215) u != 0 in function create
use lower resolutions for the training image and also for the live captured one….
thanks bro
then in what way i could train high resolution images??
again i got this new error pls help me
OpenCV Error: Assertion failed (The data should normally be NULL!) in allocate, file /home/pi/opencv-3.1.0/modules/python/src2/cv2.cpp, line 163
Traceback (most recent call last):
File “obj-det.py”, line 23, in
matches=flann.knnMatch(queryDesc,trainDesc,k=2)
cv2.error: /home/pi/opencv-3.1.0/modules/python/src2/cv2.cpp:163: error: (-215) The data should normally be NULL! in function allocate
Getting this error. how to resolve this error. kindly help out
cv2.error: C:\projects\opencv-python\opencv_contrib\modules\xfeatures2d\src\sift.cpp:1044: error: (-5) image is empty or has incorrect depth (!=CV_8U) in function cv::xfeatures2d::SIFT_Impl::detectAndCompute
Hi
After running this I will get a segmentation fault and the program will crash. I am using openCV3.0 and python2.7. I’ve tried resizing the training and query images, but the segmentation fault is still occurring. I put in print statements to see where it appeared to be happening and it looks like the crash happens in
The crash occurs at random times from the beginning of the program; some runs last a minute, some only a few seconds.
Any ideas on solving this problem? Thanks
Hello
You've done a great job.
I've got a problem, can you help me out, as I'm not good with this.
Traceback (most recent call last):
File "C:\Users\Hari\Downloads\Lego recognition\ObjectDetector.py", line 12, in
trainKP,trainDesc=detector.detectAndCompute(trainImg,None)
error: C:\build\2_4_winpack-bindings-win32-vc14-static\opencv\modules\nonfree\src\sift.cpp:724: error: (-5) image is empty or has incorrect depth (!=CV_8U) in function cv::SIFT::operator ().
Thanks
detector=cv2.xfeatures2d.SIFT_create()
How use the face recognition and object recognition in the same windows ? thanks!
Hi, I am getting an "invalid syntax" error with the " mark on the 3rd line from the bottom, after
print "Not enough matches - %d/%d
Any ideas??
needs parenthesis around print in python 3
print(“…………..%d/%d”%(……………..))
Hi,
This example works great.
Does anyone know how this example could be modified so that it identifies objects that ARE NOT in the reference image. For example, the train image is a photo of a room, then we watch the room and draw lines around anything new in the room that was not in the train image.
Also, same thing in reverse, highlight spots where there was something and now its gone.
Hey,
I tried your code and it work perfectly fine for me. Is it possible to find the distance of the object from the camera?
Hey,
I tried this code and I am getting this error. How to resolve this???
Traceback (most recent call last):
File "C:\opencv\sources\modules\nonfree\src\Object Detector\objectdetector.py", line 12, in
trainKP,trainDesc=detector.detectAndCompute(trainimg,None)
error: ……..opencv\modules\nonfree\src\sift.cpp:722: error: (-5) image is empty or has incorrect depth (!=CV_8U) in function cv::SIFT::operator ()
getting error:
File “”, line 1, in
AttributeError: ‘module’ object has no attribute ‘SURF’
How can I use this code to compare two objects with what the webcam detects instead of only one? I mean the webcam detects two objects