
How can I locate something on my screen quickly in Python?


The official documentation says it should take 1-2 seconds on a 1920x1080 screen, so your time seems to be a bit slow. I would try to optimize:

  • Use grayscaling unless color information is important (grayscale=True is supposed to give a ~30% speedup)
  • Use a smaller image to locate (e.g. only a distinctive part, if that already uniquely identifies the position you need)
  • Don't reload the needle image from file on every call; load it once and keep it in memory
  • Pass in a region argument if you already know something about the possible locations (e.g. from previous runs)

This is all described in the documentation linked above.
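As a small illustration of the last tip, one way to derive a region argument from a previous match is to pad the last-known bounding box by a margin and clamp it to the screen (pad_region here is a hypothetical helper, not part of pyautogui):

```python
# Hypothetical helper: turn a previous match box into a pyautogui
# region=(left, top, width, height) argument, padded by a margin and
# clamped to the screen bounds.
def pad_region(box, margin, screen_w, screen_h):
    left, top, width, height = box
    new_left = max(0, left - margin)
    new_top = max(0, top - margin)
    new_right = min(screen_w, left + width + margin)
    new_bottom = min(screen_h, top + height + margin)
    return (new_left, new_top, new_right - new_left, new_bottom - new_top)

# Example: a match found at (100, 200) sized 50x30, padded by 20 px
# on a 1920x1080 screen.
print(pad_region((100, 200, 50, 30), 20, 1920, 1080))
# → (80, 180, 90, 70)
```

The resulting tuple could then be passed as the region argument to the next locateOnScreen call, so the search only scans the padded neighborhood of the last hit.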

If this is still not fast enough, you can check the sources of pyautogui and see that locateOnScreen uses a specific search algorithm (Knuth-Morris-Pratt) implemented in pure Python. Implementing this part in C may result in quite a pronounced speedup.


Make a function and run it in a thread. (The confidence argument requires OpenCV.)

    import pyautogui

    def locate_cat():
        cat = None
        while cat is None:
            cat = pyautogui.locateOnScreen('Pictures/cat.png', confidence=.65,
                                           region=(1722, 748, 200, 450))
        return cat
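The threading part is not shown above; a minimal, stdlib-only sketch of running a slow locate function in a background thread so the main loop stays responsive could look like this (find_cat here is a placeholder standing in for the pyautogui call):

```python
import threading

# Placeholder for the slow pyautogui.locateOnScreen loop; any blocking
# search function can be run in the background this way.
def find_cat(result):
    result['cat'] = (1722, 748, 200, 450)  # pretend match box

result = {}
worker = threading.Thread(target=find_cat, args=(result,), daemon=True)
worker.start()
# ... the main loop can keep doing other work here ...
worker.join()
print(result['cat'])
# → (1722, 748, 200, 450)
```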

You can use the region argument if you know the rough location of the image on screen.

In some cases you can locate the image once, save the returned box in a variable, and pass region=somevar on later calls, so the search looks in the same place it found the image last time. This helps speed up the detection process.

For example:

    import pyautogui

    def first_find():
        front_door = None
        while front_door is None:
            front_door = pyautogui.locateOnScreen('frontdoor.png', confidence=.95,
                                                  region=(1722, 748, 200, 450))
        return front_door

    def second_find(front_door_save):
        front_door = None
        while front_door is None:
            front_door = pyautogui.locateOnScreen('frontdoor.png', confidence=.95,
                                                  region=front_door_save)
        return front_door

    def find_person(front_door):
        person = None
        while person is None:
            person = pyautogui.locateOnScreen('person.png', confidence=.95,
                                              region=front_door)
        return person

    while True:
        front_door_save = first_find()
        front_door = second_find(front_door_save)
        if front_door is not None:
            find_person(front_door)


I faced the same issue with pyautogui. Though it is a very convenient library, it is quite slow.

I gained a ~10x speedup relying on cv2 and PIL:

    def benchmark_opencv_pil(method):
        img = ImageGrab.grab(bbox=REGION)
        img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
        res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, method)
        return (res >= 0.8).any()

Using TM_CCOEFF_NORMED worked well here (and you can of course adjust the 0.8 threshold).
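Note that (res >= 0.8).any() only tells you whether the template is present; if you also need its position, the index of the maximum score gives the top-left corner of the best match. Sketched here on a synthetic score array standing in for a real matchTemplate result:

```python
import numpy as np

# Synthetic stand-in for a cv.matchTemplate(..., cv.TM_CCOEFF_NORMED)
# result: one strong match at row 2, column 3.
res = np.full((5, 5), 0.1)
res[2, 3] = 0.95

if (res >= 0.8).any():
    # argmax + unravel_index is the NumPy equivalent of cv.minMaxLoc's
    # max location; (x, y) is the top-left corner of the best match.
    y, x = np.unravel_index(res.argmax(), res.shape)
    print((x, y))
# → (3, 2)
```

On a real result you would use the same pattern on the array returned by cv.matchTemplate (or call cv.minMaxLoc directly).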

Source : Fast locateOnScreen with Python

For the sake of completeness, here is the full benchmark:

    import pyautogui as pg
    import numpy as np
    import cv2 as cv
    from PIL import ImageGrab, Image
    import time

    REGION = (0, 0, 400, 400)
    GAME_OVER_PICTURE_PIL = Image.open("./balloon_fight_game_over.png")
    GAME_OVER_PICTURE_CV = cv.imread('./balloon_fight_game_over.png')

    def timing(f):
        def wrap(*args, **kwargs):
            time1 = time.time()
            ret = f(*args, **kwargs)
            time2 = time.time()
            print('{:s} function took {:.3f} ms'.format(
                f.__name__, (time2 - time1) * 1000.0))
            return ret
        return wrap

    @timing
    def benchmark_pyautogui():
        res = pg.locateOnScreen(GAME_OVER_PICTURE_PIL,
                                grayscale=True,  # should provide a speedup
                                confidence=0.8,
                                region=REGION)
        return res is not None

    @timing
    def benchmark_opencv_pil(method):
        img = ImageGrab.grab(bbox=REGION)
        img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
        res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, method)
        return (res >= 0.8).any()

    if __name__ == "__main__":
        im_pyautogui = benchmark_pyautogui()
        print(im_pyautogui)

        methods = ['cv.TM_CCOEFF', 'cv.TM_CCOEFF_NORMED', 'cv.TM_CCORR',
                   'cv.TM_CCORR_NORMED', 'cv.TM_SQDIFF', 'cv.TM_SQDIFF_NORMED']

        # cv.TM_CCOEFF_NORMED actually seems to be the most relevant method
        for method in methods:
            print(method)
            im_opencv = benchmark_opencv_pil(eval(method))
            print(im_opencv)

And the results show a ~10x improvement:

    benchmark_pyautogui function took 175.712 ms
    False
    cv.TM_CCOEFF
    benchmark_opencv_pil function took 21.283 ms
    True
    cv.TM_CCOEFF_NORMED
    benchmark_opencv_pil function took 23.377 ms
    False
    cv.TM_CCORR
    benchmark_opencv_pil function took 20.465 ms
    True
    cv.TM_CCORR_NORMED
    benchmark_opencv_pil function took 25.347 ms
    False
    cv.TM_SQDIFF
    benchmark_opencv_pil function took 23.799 ms
    True
    cv.TM_SQDIFF_NORMED
    benchmark_opencv_pil function took 22.882 ms
    True