Python urllib2 Response header


Try making the request the way Firefox does. You can see the request headers Firefox sends in Firebug, so add them to your request object:

    import urllib2

    request = urllib2.Request('http://your.tld/...')
    request.add_header('User-Agent', 'some fake agent string')
    request.add_header('Referer', 'fake referrer')
    # ...
    response = urllib2.urlopen(request)
    # check content type:
    print response.info().getheader('Content-Type')

There's also HTTPCookieProcessor, which stores cookies and sends them back on later requests the way a browser does, but I don't think you'll need it in most cases. Have a look at Python's documentation:

http://docs.python.org/library/urllib2.html
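
In case the cookie handling does turn out to matter, here is a minimal sketch of wiring HTTPCookieProcessor into an opener. It assumes Python 3, where urllib2 became urllib.request and cookielib became http.cookiejar; the helper name build_browser_like_opener is made up for illustration.

```python
import http.cookiejar
import urllib.request


def build_browser_like_opener(user_agent):
    """Return (opener, jar): an opener that keeps cookies between requests.

    Hypothetical helper, not part of the standard library.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # addheaders replaces the default "Python-urllib/x.y" User-Agent
    # on every request made through this opener.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


opener, jar = build_browser_like_opener("some fake agent string")
# response = opener.open("http://your.tld/...")  # network call, left commented out
# Any cookies set by the server would then show up when iterating over jar.
```

Requests made through opener.open() reuse the same jar, so a session cookie set by the first response is sent along with the next request.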


Content-Type text/html

Really, like that, without the colon?

If so, that might explain it: it's an invalid header, so it gets ignored, and urllib guesses the content type instead by looking at the filename. If the URL happens to have ‘.flv’ at the end, it'll guess the type should be video/x-flv.
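
A rough way to see both effects without touching the network is sketched below. It uses Python 3's http.client header parser and the mimetypes module, which is an assumption on my part: urllib2 in Python 2 parses headers with different code, so treat this as an illustration of the idea rather than a trace of what urllib2 does. The helper names and the sample header blocks are made up.

```python
import io
import http.client
import mimetypes


def parsed_content_type(raw_header_block):
    """Return the Content-Type that http.client extracts, or None if there is none."""
    message = http.client.parse_headers(io.BytesIO(raw_header_block))
    return message.get("Content-Type")


def guessed_type(url):
    """Guess a MIME type from the file extension in the URL alone."""
    return mimetypes.guess_type(url)[0]


# Two made-up header blocks that differ only in the colon.
valid = b"Server: demo\r\nContent-Type: text/html\r\n\r\n"
broken = b"Server: demo\r\nContent-Type text/html\r\n\r\n"

print(parsed_content_type(valid))   # the header is found
print(parsed_content_type(broken))  # the colon-less line is not treated as a header
# The result for .flv depends on the MIME table available on the machine.
print(guessed_type("http://your.tld/clip.flv"))
```

If the second call finds no Content-Type, whatever consumes the response has nothing to go on except a fallback such as the extension-based guess.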


This peculiar discrepancy might be explained by the two requests sending different headers (maybe ones of the Accept kind); can you check that? Or, if JavaScript is running in Firefox (which I assume you're using, since you're running Firebug), then, since it's definitely NOT running in the Python case, "all bets are off", as they say ;-).
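
One way to check is to copy the request headers out of Firebug into a dict and compare them with what the Python request will send. The sketch below assumes Python 3's urllib.request; header_diff is a hypothetical helper, and the header values are placeholders rather than what any real browser sends.

```python
import urllib.request


def header_diff(browser_headers, request):
    """Return the browser headers that the request lacks or sends differently.

    Header names are compared case-insensitively, because Request.add_header
    normalises capitalisation (for example 'User-Agent' becomes 'User-agent').
    """
    sent = {name.lower(): value for name, value in request.header_items()}
    wanted = {name.lower(): value for name, value in browser_headers.items()}
    return {name: value for name, value in wanted.items() if sent.get(name) != value}


# Placeholder values; paste the real ones from Firebug here.
browser_headers = {
    "User-Agent": "some fake agent string",
    "Accept": "text/html",
}

request = urllib.request.Request("http://your.tld/")
request.add_header("User-Agent", "some fake agent string")

missing = header_diff(browser_headers, request)
print(missing)

# Add whatever is missing, then compare the responses again.
for name, value in missing.items():
    request.add_header(name, value)
```

Note that this only covers headers set on the Request object; urllib adds a few more (such as Host) when the request is actually sent.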