Scrapy with dynamic captcha


Here's a complete solution to bypass the specified captcha using anticaptcha and PIL.

Because this captcha is dynamic, we need to take a screenshot of the img element that contains it. To do that, we use save_screenshot() to capture the page, then PIL to crop <img name="imagen"... out of the screenshot and save it to disk (captcha.png).
We then submit captcha.png to anti-captcha, which returns the solution, i.e.:

    from PIL import Image
    from python_anticaptcha import AnticaptchaClient, ImageToTextTask
    from selenium import webdriver
    from selenium.webdriver.common.by import By


    def get_captcha(driver):
        captcha_fn = "captcha.png"
        # <img> element that contains the captcha image
        # (find_element_by_name() was removed in Selenium 4.3)
        element = driver.find_element(By.NAME, "imagen")
        location = element.location
        size = element.size
        driver.save_screenshot("temp.png")

        # bounding box of the element inside the full-page screenshot
        x = location['x']
        y = location['y']
        right = x + size['width']
        lower = y + size['height']

        im = Image.open('temp.png')
        im = im.crop((int(x), int(y), int(right), int(lower)))
        im.save(captcha_fn)

        # ask the anti-captcha service to decode the captcha
        api_key = 'XXXXXXXXXXXXXXXXXXXXXXXXXX'  # api key -> https://anti-captcha.com/
        client = AnticaptchaClient(api_key)
        with open(captcha_fn, 'rb') as captcha_fp:  # close the file when done
            task = ImageToTextTask(captcha_fp)
            job = client.createTask(task)
            job.join()  # block until the service returns a result
        return job.get_captcha_text()


    start_url = "YOU KNOW THE URL"
    driver = webdriver.Chrome()
    driver.get(start_url)
    captcha = get_captcha(driver)
    print(captcha)

Output:

ifds

captcha.png: [image: the cropped captcha]
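
The crop step assumes that screenshot pixels line up with the coordinates Selenium reports for the element. On high-DPI displays the screenshot can be larger than the page layout, in which case the crop box would miss the captcha. Here is a minimal sketch of the crop-box arithmetic with an optional scale factor. The names crop_box and scale are hypothetical (they are not part of Selenium or PIL), and where the scale value comes from is an assumption about your setup (one option might be reading window.devicePixelRatio through driver.execute_script()):

```python
def crop_box(location, size, scale=1.0):
    """Return the (left, upper, right, lower) tuple that PIL's Image.crop() expects.

    location -- dict with 'x' and 'y', as returned by element.location
    size     -- dict with 'width' and 'height', as returned by element.size
    scale    -- hypothetical factor between layout pixels and screenshot pixels
    """
    left = int(location["x"] * scale)
    upper = int(location["y"] * scale)
    right = int((location["x"] + size["width"]) * scale)
    lower = int((location["y"] + size["height"]) * scale)
    return (left, upper, right, lower)


# Stand-in values, so the sketch runs without a browser.
location = {"x": 100, "y": 50}
size = {"width": 120, "height": 40}

print(crop_box(location, size))             # (100, 50, 220, 90)
print(crop_box(location, size, scale=2.0))  # (200, 100, 440, 180)
```

In the script, the result would be passed straight to the crop call, e.g. im.crop(crop_box(element.location, element.size)).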


Notes:

  • Use it at your own risk (be smart);
  • You can improve the code by handling exceptions properly;
  • anticaptcha is a paid service ($0.50 per 1,000 images);
  • I'm not affiliated with anticaptcha.
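
As a starting point for the exception-handling note, here is a minimal sketch of a retry wrapper. solve_with_retries and its parameters are hypothetical names, not part of any library, and flaky_solver is only a stand-in for the real solver so the example runs offline. In the real script you would probably pass something like lambda: get_captcha() together with the exception types your installed versions of Selenium and python_anticaptcha actually raise (for example selenium.common.exceptions.NoSuchElementException); which exceptions to retry on is an assumption you should check against your setup.

```python
import time


def solve_with_retries(solve, retry_on=(Exception,), attempts=3, delay=2.0):
    """Call solve() until it succeeds or the attempts run out.

    solve    -- zero-argument callable that returns the captcha text
    retry_on -- tuple of exception types that trigger another attempt
    attempts -- maximum number of calls to solve()
    delay    -- seconds to wait between attempts
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return solve()
        except retry_on as exc:
            last_error = exc
            print(f"attempt {attempt} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay)
    raise RuntimeError(f"captcha not solved after {attempts} attempts") from last_error


# Stand-in solver that fails twice before succeeding.
calls = {"n": 0}


def flaky_solver():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("service busy")
    return "ifds"


print(solve_with_retries(flaky_solver, retry_on=(ValueError,), delay=0))
```

Exceptions that are not listed in retry_on propagate immediately, so genuine bugs are not hidden behind retries.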