unzipping file results in "BadZipFile: File is not a zip file"


Files named `file` can confuse Python, so try naming it something else. If it still won't work, try this code:

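    def fixBadZipfile(zipFile):
        # Open in binary read/write mode so the file can be truncated in place.
        with open(zipFile, 'r+b') as f:
            data = f.read()
            # End of central directory signature (bytes literal, required on Python 3)
            pos = data.find(b'\x50\x4b\x05\x06')
            if pos > 0:
                print("Truncating file at location " + str(pos + 22) + ".")
                f.seek(pos + 22)  # size of 'ZIP end of central directory record'
                f.truncate()
            else:
                # No signature found: the file is truncated or is not a zip at all.
                raise ValueError("End of central directory signature not found")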


astronautlevel's solution works for most cases, but the compressed data and CRCs in the zip can also contain the same 4 bytes. You should do an `rfind` (not `find`), seek to `pos+20`, and then write `\x00\x00` to the end of the file (this tells zip applications that the 'comment' section is 0 bytes long).

    # HACK: See http://bugs.python.org/issue10694
    # The zip file generated is correct, but because of extra data after the
    # 'central directory' section, some versions of Python (and some zip
    # applications) can't read the file. By removing the extra data, we ensure
    # that all applications can read the zip without issue.
    # The ZIP format: http://www.pkware.com/documents/APPNOTE/APPNOTE-6.3.0.TXT
    # Finding the end of the central directory:
    #   http://stackoverflow.com/questions/8593904/how-to-find-the-position-of-central-directory-in-a-zip-file
    #   http://stackoverflow.com/questions/20276105/why-cant-python-execute-a-zip-archive-passed-via-stdin
    #       This second link is only loosely related, but echoes the first:
    #       "processing a ZIP archive often requires backwards seeking"
    # zipFileContainer must be a seekable file object opened in binary read/write mode.
    content = zipFileContainer.read()
    # Reverse find: this string of bytes is the end of the zip's central directory.
    pos = content.rfind(b'\x50\x4b\x05\x06')
    if pos > 0:
        zipFileContainer.seek(pos + 20)  # +20: see section V.I in 'ZIP format' link above.
        zipFileContainer.truncate()
        # Zip file comment length: 0 bytes; tell zip applications to stop reading.
        zipFileContainer.write(b'\x00\x00')
        zipFileContainer.seek(0)
    return zipFileContainer
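For anyone who wants to try the `rfind` approach on a file on disk, here is a minimal self-contained sketch for Python 3. The function name `strip_trailing_data` and its return convention are my own assumptions, not part of the answer above. It also assumes the trailing junk does not itself contain the signature bytes, and it discards any real archive comment, so run it on a copy.

```python
EOCD_SIGNATURE = b"\x50\x4b\x05\x06"  # "PK\x05\x06", end of central directory


def strip_trailing_data(path):
    """Cut everything after the last EOCD record and zero its comment length.

    Hypothetical helper: returns True if the file was rewritten,
    False if no complete EOCD record could be located.
    """
    with open(path, "r+b") as f:
        content = f.read()
        # Search backwards: compressed data may contain the same 4 bytes.
        pos = content.rfind(EOCD_SIGNATURE)
        # The fixed part of the EOCD record is 22 bytes; bail out if it is cut off.
        if pos < 0 or pos + 22 > len(content):
            return False
        # Bytes 20..21 of the record hold the comment length.
        f.seek(pos + 20)
        f.truncate()
        # Comment length 0: nothing follows the record.
        f.write(b"\x00\x00")
    return True
```

After calling it, `zipfile.ZipFile(path)` should be able to open the archive if trailing data was the only problem.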


I ran into the same issue. My problem was that the file was a gzip file, not a zip file. I switched to the class `gzip.GzipFile` and it worked like a charm.
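If you are not sure which format you actually have, a quick check of the leading bytes can tell you before you pick a class. This is only a sketch: `sniff_archive_type` and `read_gzip` are made-up helper names, and the check relies on the gzip magic number `\x1f\x8b` plus the standard `zipfile.is_zipfile`.

```python
import gzip
import zipfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def sniff_archive_type(path):
    """Return 'gzip', 'zip', or 'unknown' for the file at path (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(2)
    if head == GZIP_MAGIC:
        return "gzip"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"


def read_gzip(path):
    """Decompress a gzip file fully into memory using gzip.GzipFile."""
    with gzip.GzipFile(path, "rb") as gz:
        return gz.read()
```

If `sniff_archive_type` reports `'gzip'`, opening the file with `zipfile.ZipFile` is expected to raise `BadZipFile`, and `read_gzip` (or `gzip.open`) is the right tool instead.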