Scrapy with dynamic captcha
Here's a complete solution to bypass the specified captcha
using anticaptcha and PIL.
Because this captcha is dynamic, we need to take a screenshot of the img element that contains it. For that we use save_screenshot() and PIL to crop <img name="imagen"... out of the page screenshot and save it to disk (captcha.png).
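The crop step boils down to turning the element's location and size into a (left, upper, right, lower) box, which is what PIL's Image.crop() expects. Here is a minimal sketch of that arithmetic. The helper name crop_box and the scale parameter are my own additions, not part of Selenium or PIL. scale is an assumption for high-DPI screens, where the screenshot may have more pixels than the page coordinates suggest; leave it at 1.0 if the crop already lines up.

```python
import math


def crop_box(location, size, scale=1.0):
    """Return (left, upper, right, lower) in screenshot pixels.

    location and size are dicts shaped like Selenium's element.location
    and element.size. scale is an assumed device pixel ratio (hypothetical
    parameter, 1.0 means no scaling).
    """
    left = math.floor(location["x"] * scale)
    upper = math.floor(location["y"] * scale)
    # Round the far edges up so the captcha is never clipped.
    right = math.ceil((location["x"] + size["width"]) * scale)
    lower = math.ceil((location["y"] + size["height"]) * scale)
    return (left, upper, right, lower)


print(crop_box({"x": 10, "y": 20}, {"width": 100, "height": 40}))
# prints (10, 20, 110, 60)
```

The resulting tuple can be passed straight to im.crop(...).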
We then submit captcha.png to anti-captcha, which returns the solution, i.e.:
    from PIL import Image
    from python_anticaptcha import AnticaptchaClient, ImageToTextTask
    from selenium import webdriver
    from selenium.webdriver.common.by import By


    def get_captcha():
        captcha_fn = "captcha.png"
        # <img> element (name="imagen") that contains the captcha image
        element = driver.find_element(By.NAME, "imagen")
        location = element.location
        size = element.size
        driver.save_screenshot("temp.png")

        # crop box of the element inside the full-page screenshot
        x = location['x']
        y = location['y']
        w = size['width']
        h = size['height']
        right = x + w
        bottom = y + h

        im = Image.open('temp.png')
        im = im.crop((int(x), int(y), int(right), int(bottom)))
        im.save(captcha_fn)

        # request anti-captcha service to decode the captcha
        api_key = 'XXXXXXXXXXXXXXXXXXXXXXXXXX'  # api key -> https://anti-captcha.com/
        client = AnticaptchaClient(api_key)
        with open(captcha_fn, 'rb') as captcha_fp:
            task = ImageToTextTask(captcha_fp)
            job = client.createTask(task)
            job.join()  # wait until the service has solved the captcha
        return job.get_captcha_text()


    start_url = "YOU KNOW THE URL"
    driver = webdriver.Chrome()
    driver.get(start_url)
    captcha = get_captcha()
    print(captcha)
Output:
ifds
[image: captcha.png]
Notes:
- Use it at your own responsibility (be smart);
- You can improve the code by handling exceptions properly;
- anticaptcha is a paid service (0.5$/1000 imgs);
- I'm not affiliated with anticaptcha.
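On the exception-handling note: one simple way to go about it is to wrap the solving call in a small retry loop. This is only a sketch. solve_with_retries is a hypothetical helper of mine, not part of python_anticaptcha or Selenium, and by default it retries on any Exception; you would probably want to narrow retry_on to the errors you actually see.

```python
import time


def solve_with_retries(solve, attempts=3, delay=2.0, retry_on=(Exception,)):
    """Call solve() until it returns a non-empty string or attempts run out.

    solve is any zero-argument callable, e.g. get_captcha from the answer.
    All parameter names here are illustrative.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            text = solve()
            if text:
                return text
            last_error = ValueError("empty captcha text")
        except retry_on as exc:
            last_error = exc
        # Pause before the next try, but not after the last one.
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError(
        f"captcha not solved after {attempts} attempts"
    ) from last_error
```

Usage would look like captcha = solve_with_retries(get_captcha). Keep in mind that every retry that reaches the service is a paid request.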