How to know if urllib.urlretrieve succeeds?
Consider using urllib2 if that is possible in your case. It is more capable and easier to use than urllib.
You can detect any HTTP errors easily:
    >>> import urllib2
    >>> resp = urllib2.urlopen("http://google.com/abc.jpg")
    Traceback (most recent call last):
    <<MANY LINES SKIPPED>>
    urllib2.HTTPError: HTTP Error 404: Not Found
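In a script rather than at the interactive prompt, the same check can be wrapped in a try/except. The following is a minimal sketch, not a definitive implementation: check_url is a hypothetical helper name, and the import fallback is an assumption added so that the snippet also runs on Python 3, where urllib2 was split into urllib.request and urllib.error.

```python
try:
    # Python 2
    from urllib2 import urlopen, HTTPError, URLError
except ImportError:
    # Python 3 (assumption: same names, moved to urllib.request / urllib.error)
    from urllib.request import urlopen
    from urllib.error import HTTPError, URLError


def check_url(url):
    """Return (True, "OK") if the URL opens, else (False, reason).

    check_url is a hypothetical helper, not part of urllib2.
    """
    try:
        resp = urlopen(url)
    except HTTPError as e:
        # The server answered, but with an error status such as 404.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return False, "HTTP Error %d" % e.code
    except URLError as e:
        # No usable answer at all (DNS failure, refused connection, ...).
        return False, "URL error: %s" % (e.reason,)
    resp.close()
    return True, "OK"
```

Calling check_url("http://google.com/abc.jpg") should then report the failure as a return value instead of a traceback.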
When the request succeeds, resp is a file-like response object that you can do a lot of useful things with:
    >>> resp = urllib2.urlopen("http://google.com/")
    >>> resp.code
    200
    >>> resp.headers["content-type"]
    'text/html; charset=windows-1251'
    >>> resp.read()
    "<<ACTUAL HTML>>"
I keep it simple:
    # Simple downloading with progress indicator, by Cees Timmerman, 16mar12.
    import urllib2

    remote = r"http://some.big.file"
    local = r"c:\downloads\bigfile.dat"

    u = urllib2.urlopen(remote)
    h = u.info()
    totalSize = int(h["Content-Length"])

    print "Downloading %s bytes..." % totalSize,
    fp = open(local, 'wb')

    blockSize = 8192 #100000 # urllib.urlretrieve uses 8192
    count = 0
    while True:
        chunk = u.read(blockSize)
        if not chunk:
            break
        fp.write(chunk)
        count += 1
        if totalSize > 0:
            percent = int(count * blockSize * 100 / totalSize)
            if percent > 100:
                percent = 100
            print "%2d%%" % percent,
            if percent < 100:
                print "\b\b\b\b\b",  # Erase "NN% "
            else:
                print "Done."

    fp.flush()
    fp.close()
    if not totalSize:
        print
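Since the question is about knowing whether the download succeeded, one option is to compare the number of bytes actually written with the Content-Length header once the loop ends. What follows is a minimal sketch, assuming the server sends that header. copy_and_verify is a hypothetical helper name, and it accepts any file-like objects so that it can be tried without a network connection.

```python
import io


def copy_and_verify(src, dst, expected_size=None, block_size=8192):
    """Copy src to dst in chunks; return the number of bytes written.

    Raises IOError if expected_size is given and the byte count differs.
    copy_and_verify is a hypothetical helper, not part of urllib or urllib2.
    """
    written = 0
    while True:
        chunk = src.read(block_size)
        if not chunk:
            break
        dst.write(chunk)
        written += len(chunk)  # count real bytes, not blocks
    if expected_size is not None and written != expected_size:
        raise IOError("expected %d bytes, got %d" % (expected_size, written))
    return written


# Offline demonstration: in-memory streams stand in for the
# urlopen() response and the local file.
data = b"x" * 20000
out = io.BytesIO()
n = copy_and_verify(io.BytesIO(data), out, expected_size=len(data))
```

With urllib2, src would be the object returned by urlopen and expected_size would be int(u.info()["Content-Length"]). Counting len(chunk) rather than whole blocks matters because read may return fewer bytes than requested.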
The documentation does not seem to cover this, but to get access to the message it looks like you can do something like:
a, b=urllib.urlretrieve('http://google.com/abc.jpg', r'c:\abc.jpg')
b is the message instance.
While learning Python, I have found it is always useful to use Python's ability to be introspective. When I type
dir(b)
I see lots of methods and functions to play with. Then I started doing things with b, for example:
b.items()
This lists lots of interesting things. I suspect that playing around with them will let you find the attribute you want to work with.
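One concrete use of the message object: it may carry a Content-Length header, which can be compared with the size of the file that urlretrieve wrote. This is a minimal sketch under that assumption. retrieve_checked is a hypothetical helper name, and the import fallback is added so the snippet also runs on Python 3, where the function lives in urllib.request.

```python
import os

try:
    from urllib import urlretrieve          # Python 2
except ImportError:
    from urllib.request import urlretrieve  # Python 3


def retrieve_checked(url, dest):
    """Download url to dest and return (filename, headers).

    Raises IOError if a Content-Length header is present and the
    size of the saved file does not match it.
    retrieve_checked is a hypothetical helper, not part of urllib.
    """
    filename, headers = urlretrieve(url, dest)
    expected = headers.get("Content-Length")  # None if the header is absent
    actual = os.path.getsize(filename)
    if expected is not None and int(expected) != actual:
        raise IOError("expected %s bytes, got %d" % (expected, actual))
    return filename, headers
```

A size match only shows that the transfer was complete. It does not show that the server returned the file you wanted rather than an error page.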
Sorry this is such a beginner's answer, but I am trying to master Python's introspection abilities to improve my learning, and your question just popped up.
Well, I tried something interesting related to this. I was wondering if I could automatically get the output from each of the things in the dir() listing that did not need parameters, so I wrote:
    needparam = []
    for each in dir(b):
        x = 'b.' + each + '()'
        try:
            eval(x)
            print x
        except:
            needparam.append(x)
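The same idea can be written without eval by looking attributes up with getattr and testing them with callable. This is a sketch only: call_no_arg_methods and the Demo class are hypothetical names introduced for illustration, and unlike the loop above it skips names that start with an underscore and collects the return values instead of printing the expression.

```python
def call_no_arg_methods(obj):
    """Call every public callable attribute of obj with no arguments.

    Returns (results, needparam): results maps attribute names to return
    values, needparam lists the names whose call raised an exception.
    call_no_arg_methods is a hypothetical helper for exploration only.
    """
    results = {}
    needparam = []
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and special attributes
        attr = getattr(obj, name)
        if not callable(attr):
            continue  # plain data attribute, nothing to call
        try:
            results[name] = attr()
        except Exception:
            needparam.append(name)
    return results, needparam


class Demo(object):
    """Tiny stand-in object so the helper can be tried offline."""
    size = 3

    def items(self):
        return [("a", 1)]

    def get(self, key):
        return key


results, needparam = call_no_arg_methods(Demo())
```

Keep in mind that blindly calling every method can have side effects, so this is best reserved for objects you are just exploring, such as the message returned by urlretrieve.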