How to use Python requests to fake a browser visit, a.k.a. generate a User-Agent?


Provide a User-Agent header:

    import requests

    url = 'http://www.ichangtou.com/#company:data_000008.html'
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

    response = requests.get(url, headers=headers)
    print(response.content)
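If several requests should carry the same header, one option is to set it once on a `requests.Session`. The sketch below assumes that approach; `build_session` and `preview_user_agent` are hypothetical helper names, not part of requests. It prepares a request without sending it, so the header that would go out can be inspected offline.

```python
import requests

# Browser-like User-Agent string copied from the answer above.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/39.0.2171.95 Safari/537.36"
)


def build_session(user_agent=BROWSER_UA):
    """Return a Session that sends `user_agent` on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def preview_user_agent(session, url):
    """Prepare (but do not send) a GET request and return its User-Agent."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return prepared.headers["User-Agent"]


session = build_session()
sent_ua = preview_user_agent(session, "http://www.ichangtou.com/")
print(sent_ua)
```

Calling `session.get(url)` afterwards would send the same header; without the override, requests identifies itself with a `python-requests/...` string.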

FYI, here is a list of User-Agent strings for different browsers:


As a side note, there is a pretty useful third-party package called fake-useragent that provides a nice abstraction layer over user agents:

fake-useragent

Up to date simple useragent faker with real world database

Demo:

    >>> from fake_useragent import UserAgent
    >>> ua = UserAgent()
    >>> ua.chrome
    u'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36'
    >>> ua.random
    u'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36'


I used the fake-useragent package.

How to use:

    from fake_useragent import UserAgent
    import requests

    ua = UserAgent()
    print(ua.chrome)
    header = {'User-Agent': str(ua.chrome)}
    print(header)
    url = "https://www.hybrid-analysis.com/recent-submissions?filter=file&sort=^timestamp"
    htmlContent = requests.get(url, headers=header)
    print(htmlContent)

Output:

    Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17
    {'User-Agent': 'Mozilla/5.0 (X11; OpenBSD i386) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36'}
    <Response [200]>
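Note that the two User-Agent strings in the output differ, which suggests each `ua.chrome` access draws a fresh value. When the same identity should be used for a whole run, it may be simpler to pick one string up front and reuse it. The sketch below is a standard-library stand-in, not the fake-useragent API: `UA_POOL`, `pick_user_agent` and `make_headers` are hypothetical names, and the pool just reuses strings quoted in this thread.

```python
import random

# User-Agent strings copied from the examples in this thread; a real pool
# would need refreshing as browsers release new versions.
UA_POOL = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 "
    "(KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17",
]


def pick_user_agent(rng=random):
    """Return one randomly chosen string from the pool."""
    return rng.choice(UA_POOL)


def make_headers(user_agent):
    """Build a headers dict suitable for requests.get(url, headers=...)."""
    return {"User-Agent": user_agent}


# Pick once, then reuse the same value for printing and for the request,
# so the logged string and the sent string cannot drift apart.
chosen = pick_user_agent()
headers = make_headers(chosen)
print(headers)
```

The same pattern applies to fake-useragent itself: store `ua.chrome` in a variable once and use that variable everywhere.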


Try this, which sends a generic Mozilla User-Agent (it is also a good starter script for web scraping with cookies):

    #!/usr/bin/env python2
    # -*- coding: utf8 -*-
    # vim:ts=4:sw=4
    import cookielib, urllib2, sys

    def doIt(uri):
        cj = cookielib.CookieJar()
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
        # Set the header on the opener *before* opening the URL;
        # assigning it to the response afterwards has no effect.
        opener.addheaders = [('User-agent', 'Mozilla/5.0')]
        page = opener.open(uri)
        print page.read()

    for i in sys.argv[1:]:
        doIt(i)

USAGE:

python script.py "http://www.ichangtou.com/#company:data_000008.html"
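The script above targets Python 2. On Python 3 the same modules live under `http.cookiejar` and `urllib.request`; a rough equivalent, setting the header on the opener before any URL is opened, might look like the sketch below. `build_opener_with_cookies` and `fetch` are hypothetical helper names, and the 10-second timeout is an arbitrary assumption.

```python
import http.cookiejar
import urllib.request


def build_opener_with_cookies(user_agent="Mozilla/5.0"):
    """Return (opener, jar): an opener that stores cookies in `jar`
    and sends `user_agent` with every request."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replaces the default "Python-urllib/x.y" header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def fetch(opener, url, timeout=10):
    """Open `url` with the given opener and return the raw body bytes."""
    with opener.open(url, timeout=timeout) as page:
        return page.read()


opener, jar = build_opener_with_cookies()
print(opener.addheaders)
```

Calling `fetch(opener, url)` performs the actual request; cookies set by the server accumulate in `jar` and are sent back on later calls through the same opener.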