Gesture Recognition Virtual Mouse Using OpenCV

In my last OpenCV tutorial, I wrote a program to detect green objects and track them. In this post, I am going to show you how we can extend that idea to gesture recognition, and we will apply that to create a virtual mouse.

I will be using that code as the base of this program and will build on top of it. So if you haven’t read the previous tutorial, you may want to go through it first.

Libraries In Use

The external libraries that we will be using:

  • OpenCV
  • NumPy
  • wx
  • pynput

Let's Take a Look Back At The Colour Detection Code

This was the color detection code. If you don’t know how it works, check that post first.

import cv2
import numpy as np

lowerBound=np.array([33,80,40])
upperBound=np.array([102,255,255])

cam= cv2.VideoCapture(0)
kernelOpen=np.ones((5,5))
kernelClose=np.ones((20,20))

# font for the labels; cv2.putText is used because the old cv2.cv module was removed in OpenCV 3
font=cv2.FONT_HERSHEY_SIMPLEX

while True:
    ret, img=cam.read()
    img=cv2.resize(img,(340,220))

    #convert BGR to HSV
    imgHSV= cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
    # create the Mask
    mask=cv2.inRange(imgHSV,lowerBound,upperBound)
    #morphology
    maskOpen=cv2.morphologyEx(mask,cv2.MORPH_OPEN,kernelOpen)
    maskClose=cv2.morphologyEx(maskOpen,cv2.MORPH_CLOSE,kernelClose)

    maskFinal=maskClose
    # note: OpenCV 3.x returns three values here (image, contours, hierarchy)
    conts,h=cv2.findContours(maskFinal.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)

    cv2.drawContours(img,conts,-1,(255,0,0),3)
    for i in range(len(conts)):
        x,y,w,h=cv2.boundingRect(conts[i])
        cv2.rectangle(img,(x,y),(x+w,y+h),(0,0,255), 2)
        # label each detected object with its index
        cv2.putText(img, str(i+1),(x,y+h),font,0.7,(0,255,255),2)
    cv2.imshow("maskClose",maskClose)
    cv2.imshow("maskOpen",maskOpen)
    cv2.imshow("mask",mask)
    cv2.imshow("cam",img)
    cv2.waitKey(10)

Let's Create a Virtual Mouse with Gesture Recognition

Okay, let's start modifying the above code for gesture recognition.

Import libraries

import cv2
import numpy as np
from pynput.mouse import Button, Controller
import wx

These are the libraries that we will be using: pynput to control mouse movement and clicking, and wx to get the display resolution of the monitor.

Global variables Setup

Now that we have all the libraries, let's set up the variables and objects.

mouse=Controller()
app=wx.App(False)
(sx,sy)=wx.GetDisplaySize()
(camx,camy)=(320,240)

We will need these variables and objects. The mouse object is for mouse movement. To get the screen resolution we first need a wx app; then we can call wx.GetDisplaySize().

Lastly, we set the variables camx and camy to the resolution we want for the captured image. We will be using them later in the image resize function.

Let's Start The Main Loop

while True:
    ret, img=cam.read()
    img=cv2.resize(img,(camx,camy))

    #convert BGR to HSV
    imgHSV= cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
    # create the Mask
    mask=cv2.inRange(imgHSV,lowerBound,upperBound)
    #morphology
    maskOpen=cv2.morphologyEx(mask,cv2.MORPH_OPEN,kernelOpen)
    maskClose=cv2.morphologyEx(maskOpen,cv2.MORPH_CLOSE,kernelClose)

    maskFinal=maskClose
    conts,h=cv2.findContours(maskFinal.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)

The above code is the portion of the loop we wrote in our color detection program. We don’t need to modify the loop till this point. We will be adding our code after this.

while True:
    :
    :

    if(len(conts)==2):
        # logic for the open gesture, move mouse without click
        ....
        ....
    elif(len(conts)==1):
        # logic for close gesture
        ....
        ....
    cv2.imshow("cam",img)
    cv2.waitKey(5)

Above is the structure of our extended code. After getting the contours in the conts variable, we check whether contours of two objects are present in the frame. If so, we move the mouse but don't perform any click operation.

Similarly, if only one object contour is present, we move the mouse and also perform a click operation.
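
The decision rule described above can be sketched on its own, without a camera. This is only an illustration: the function name classify_gesture and the gesture labels are my assumptions, not part of the original program, which branches on len(conts) directly.

```python
def classify_gesture(num_contours):
    """Map the number of detected green objects to a gesture name."""
    if num_contours == 2:
        return "open"    # two separate markers: move the mouse, no click
    if num_contours == 1:
        return "close"   # markers merged into one object: press and move
    return "none"        # nothing detected, or extra objects: do nothing


for n in (0, 1, 2, 3):
    print(n, classify_gesture(n))
```

Note that any other contour count (zero, or three and more) falls through both branches, so the mouse is simply left alone for that frame.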

Implement The Open Gesture Operation

Figure: Open gesture (virtual mouse gesture recognition)

To implement the open gesture, we need to do some calculations to find a few coordinates. See the accompanying image to get the idea.

First we calculate the center of each detected green object, which we can easily do by taking the midpoint of its bounding box (the top-left corner plus half the width and half the height). That gives us two coordinates, one per object. Averaging those two gives us the red point shown in the image. Okay, let's do this.

while True:
    :
    :
    if(len(conts)==2):
        # logic for the open gesture, move mouse without click
        x1,y1,w1,h1=cv2.boundingRect(conts[0])
        x2,y2,w2,h2=cv2.boundingRect(conts[1])
        # drawing rectangle over the objects
        cv2.rectangle(img,(x1,y1),(x1+w1,y1+h1),(255,0,0),2)
        cv2.rectangle(img,(x2,y2),(x2+w2,y2+h2),(255,0,0),2)
        #centre coordinate of first object
        # (integer division keeps pixel coordinates as ints, which the drawing functions need)
        cx1=x1+w1//2
        cy1=y1+h1//2
        # centre coordinate of the 2nd object
        cx2=x2+w2//2
        cy2=y2+h2//2
        # centre coordinate of the line connecting both points
        cx=(cx1+cx2)//2
        cy=(cy1+cy2)//2
        # Drawing the line
        cv2.line(img, (cx1,cy1),(cx2,cy2),(255,0,0),2)
        # Drawing the point (red dot)
        cv2.circle(img, (cx,cy),2,(0,0,255),2)

So the above code implements what I just explained, and with it we have the coordinate at which to position our mouse cursor.
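
The same arithmetic can be tried without a camera. The sketch below uses two made-up bounding boxes in the (x, y, w, h) form that cv2.boundingRect returns; the helper names box_center and midpoint are assumptions for illustration only.

```python
def box_center(x, y, w, h):
    """Center of a bounding box given as top-left corner plus width and height."""
    return (x + w // 2, y + h // 2)


def midpoint(p1, p2):
    """Point halfway between two (x, y) points, using integer pixel coordinates."""
    return ((p1[0] + p2[0]) // 2, (p1[1] + p2[1]) // 2)


c1 = box_center(40, 60, 20, 30)     # first object  -> (50, 75)
c2 = box_center(100, 80, 40, 20)    # second object -> (120, 90)
print(midpoint(c1, c2))             # the "red point" -> (85, 82)
```

Integer division is used throughout because pixel coordinates passed to the OpenCV drawing functions have to be integers.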

Now we need to position the mouse cursor according to the calculated coordinate. Okay, let's do that.

while True:
    :
    :
    if(len(conts)==2):
        :
        :
        mouse.release(Button.left)
        # integer division so the target matches the integer position reported by the mouse
        mouseLoc=(sx-(cx*sx//camx), cy*sy//camy)
        mouse.position=mouseLoc
        while mouse.position!=mouseLoc:
            pass

In the above code, we first release the left mouse button to make sure it is not pressed. Then we convert the detected coordinate from the camera resolution to the actual screen resolution. After that, we assign the location to mouse.position. Moving the cursor takes some time, so we have to wait until the cursor reaches that point: we start a loop that does nothing except wait until the current mouse location is the same as the assigned one. That is it for the open gesture.
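
To make the coordinate conversion concrete, here is a small sketch of the same formula as a standalone function. The function name cam_to_screen and the 1920x1080 screen size are assumptions for the example; the formula itself is the one used for mouseLoc. The x value is subtracted from the screen width, presumably because the webcam image is mirrored relative to the user (that reading is mine, the code does not say so).

```python
def cam_to_screen(cx, cy, cam_size, screen_size):
    """Scale a camera-frame point to screen coordinates, flipping the x axis."""
    camx, camy = cam_size
    sx, sy = screen_size
    return (sx - (cx * sx // camx), cy * sy // camy)


# center of a 320x240 frame maps to the center of the screen
print(cam_to_screen(160, 120, (320, 240), (1920, 1080)))   # (960, 540)
# a point in the left quarter of the frame lands in the right part of the screen
print(cam_to_screen(80, 60, (320, 240), (1920, 1080)))     # (1440, 270)
```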

Implement Close Gesture/ Clicking

Now let's implement the close gesture, where we will be clicking the object and dragging it.

while True:
    :
    :
    if(len(conts)==2):
        :
        :
    elif(len(conts)==1):
        x,y,w,h=cv2.boundingRect(conts[0])
        #drawing the rectangle
        cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
        # integer division keeps the centre and radius as ints for cv2.circle
        cx=x+w//2
        cy=y+h//2
        cv2.circle(img,(cx,cy),(w+h)//4,(0,0,255),2)

        mouse.press(Button.left)
        mouseLoc=(sx-(cx*sx//camx), cy*sy//camy)
        mouse.position=mouseLoc
        while mouse.position!=mouseLoc:
            pass

The above code is similar to the open gesture, but the difference is that we only have one object here, so we only need to calculate its center. That is where we will position our mouse pointer. Also, we are performing a mouse press operation instead of a mouse release operation. The rest is the same as before.
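
Both gestures end with the same wait loop, which spins until the cursor reports exactly the requested position. If the cursor never gets there (for example because the target is clamped at a screen edge), that loop would not end. A minimal sketch of a bounded wait is below; the function name wait_for_position and the timeout values are assumptions, not part of the original code.

```python
import time


def wait_for_position(get_position, target, timeout=0.05, poll=0.001):
    """Wait until get_position() equals target; give up after timeout seconds.

    Returns True if the target was reached, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while get_position() != target:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True


# Stand-in for the real mouse: a cursor that is already at the target
print(wait_for_position(lambda: (100, 200), (100, 200)))
# and one that never arrives
print(wait_for_position(lambda: (0, 0), (100, 200), timeout=0.01))
```

In the real loop, this could be called as wait_for_position(lambda: mouse.position, mouseLoc) in place of the bare while loop.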

This is the result:

Figure: Close gesture (virtual mouse gesture recognition)

Some Fine Tuning

We are almost done. The code is almost perfect, except that we won't be able to drag anything, because in the close gesture we call mouse.press on every frame, which results in repeated clicks while dragging.

To solve this problem, we add a flag called “pinchFlag” and set it to 1 once we perform a click operation. We won't perform the mouse press operation again until the flag is back to 0.

So the code will look like this:

pinchFlag=0# setting initial value
while True:
    :
    :
    if(len(conts)==2):
        :
        :
        if(pinchFlag==1): #perform only if pinch is on
            pinchFlag=0 # setting pinch flag off
            mouse.release(Button.left)
        mouseLoc=(sx-(cx*sx//camx), cy*sy//camy)
        mouse.position=mouseLoc
        while mouse.position!=mouseLoc:
            pass
    elif(len(conts)==1):
        :
        :
        if(pinchFlag==0): #perform only if pinch is off
            pinchFlag=1 # setting pinch flag on
            mouse.press(Button.left)
        mouseLoc=(sx-(cx*sx//camx), cy*sy//camy)
        mouse.position=mouseLoc
        while mouse.position!=mouseLoc:
            pass
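
To see what the flag buys us, the press/release logic can be simulated on a sequence of per-frame contour counts. This is only a sketch: the function mouse_events and the event strings are assumptions for illustration, standing in for the real mouse.press and mouse.release calls.

```python
def mouse_events(contour_counts):
    """Return the press/release events the pinch flag produces for a frame sequence."""
    pinch_flag = 0
    events = []
    for n in contour_counts:
        if n == 2 and pinch_flag == 1:
            pinch_flag = 0              # pinch opened: release once
            events.append("release")
        elif n == 1 and pinch_flag == 0:
            pinch_flag = 1              # pinch closed: press once
            events.append("press")
    return events


# open, open, closed for three frames, open, open, closed, open
print(mouse_events([2, 2, 1, 1, 1, 2, 2, 1, 2]))
# ['press', 'release', 'press', 'release']
```

Three consecutive "closed" frames produce a single press, so the button stays held down and a drag becomes possible.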

Final Code For Virtual Mouse with Gesture Recognition

That's pretty much it. I recorded a video tutorial for this which also fixes some more common problems; you can check that out below.

But for now, the final code looks like this:

import cv2
import numpy as np
from pynput.mouse import Button, Controller
import wx
mouse=Controller()

app=wx.App(False)
(sx,sy)=wx.GetDisplaySize()
(camx,camy)=(320,240)

lowerBound=np.array([33,80,40])
upperBound=np.array([102,255,255])

cam= cv2.VideoCapture(0)

kernelOpen=np.ones((5,5))
kernelClose=np.ones((20,20))
pinchFlag=0

while True:
    ret, img=cam.read()
    # resize to the same resolution that the mouse mapping below assumes
    img=cv2.resize(img,(camx,camy))

    #convert BGR to HSV
    imgHSV= cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
    # create the Mask
    mask=cv2.inRange(imgHSV,lowerBound,upperBound)
    #morphology
    maskOpen=cv2.morphologyEx(mask,cv2.MORPH_OPEN,kernelOpen)
    maskClose=cv2.morphologyEx(maskOpen,cv2.MORPH_CLOSE,kernelClose)

    maskFinal=maskClose
    conts,h=cv2.findContours(maskFinal.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)

    if(len(conts)==2):
        if(pinchFlag==1):
            pinchFlag=0
            mouse.release(Button.left)
        x1,y1,w1,h1=cv2.boundingRect(conts[0])
        x2,y2,w2,h2=cv2.boundingRect(conts[1])
        cv2.rectangle(img,(x1,y1),(x1+w1,y1+h1),(255,0,0),2)
        cv2.rectangle(img,(x2,y2),(x2+w2,y2+h2),(255,0,0),2)
        # integer division keeps pixel coordinates as ints
        cx1=x1+w1//2
        cy1=y1+h1//2
        cx2=x2+w2//2
        cy2=y2+h2//2
        cx=(cx1+cx2)//2
        cy=(cy1+cy2)//2
        cv2.line(img, (cx1,cy1),(cx2,cy2),(255,0,0),2)
        cv2.circle(img, (cx,cy),2,(0,0,255),2)
        mouseLoc=(sx-(cx*sx//camx), cy*sy//camy)
        mouse.position=mouseLoc
        while mouse.position!=mouseLoc:
            pass
    elif(len(conts)==1):
        x,y,w,h=cv2.boundingRect(conts[0])
        if(pinchFlag==0):
            pinchFlag=1
            mouse.press(Button.left)
        cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
        # integer division keeps the centre and radius as ints
        cx=x+w//2
        cy=y+h//2
        cv2.circle(img,(cx,cy),(w+h)//4,(0,0,255),2)
        mouseLoc=(sx-(cx*sx//camx), cy*sy//camy)
        mouse.position=mouseLoc
        while mouse.position!=mouseLoc:
            pass
    cv2.imshow("cam",img)
    cv2.waitKey(5)
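
One thing the loop above does not have is a way to stop, so it runs until the process is killed. A minimal sketch of a quit-key check follows; the helper name is_quit_key and the choice of the q key are my assumptions, not part of the original program.

```python
def is_quit_key(key_code, quit_char="q"):
    """Return True if a cv2.waitKey() result corresponds to quit_char.

    cv2.waitKey() returns -1 when no key was pressed. Otherwise only the low
    byte is compared, since some platforms set higher bits in the result.
    """
    if key_code < 0:
        return False
    return (key_code & 0xFF) == ord(quit_char)


# Hypothetical use at the end of the main loop:
#     if is_quit_key(cv2.waitKey(5)):
#         break
# and after the loop:
#     mouse.release(Button.left)   # do not leave the button held down
#     cam.release()
#     cv2.destroyAllWindows()

print(is_quit_key(-1), is_quit_key(ord("q")), is_quit_key(ord("a")))
```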

Virtual Mouse with Gesture Recognition Video Tutorial

Simple Gesture Recognition To Create Virtual Mouse | using OpenCV and Python (Tutorial) Part 1

Copyright 2021 The Codacus