Unzipping directory structure with python


The extract and extractall methods are great if you're on Python 2.6. I have to use Python 2.5 for now, so I just need to create the directories if they don't exist. You can get a listing of directories with the namelist() method. The directories will always end with a forward slash (even on Windows) e.g.,

    import os, zipfile

    z = zipfile.ZipFile('myfile.zip')
    for f in z.namelist():
        # Directory entries end with '/'; create any that don't exist yet.
        if f.endswith('/') and not os.path.isdir(f):
            os.makedirs(f)

You probably don't want to do it exactly like that (i.e., you'd probably want to extract the contents of the zip file as you iterate over the namelist), but you get the idea.
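As a rough sketch of what "extract the contents as you iterate" could look like, the function below creates each directory entry and writes each file entry by hand, without calling extract() or extractall(). The name unzip_with_dirs and its two parameters are made up for illustration. It reads each member fully into memory and does no validation of the member paths, so treat it as a starting point for archives you trust rather than a finished implementation.

```python
import os
import zipfile


def unzip_with_dirs(zip_path, dest_dir):
    """Extract zip_path into dest_dir, creating directories as needed.

    Hypothetical helper: no path validation, whole members read into memory.
    """
    z = zipfile.ZipFile(zip_path)
    try:
        for name in z.namelist():
            # Zip member names always use '/', so split on it and rejoin
            # with the local separator.
            target = os.path.join(dest_dir, *name.split('/'))
            if name.endswith('/'):
                # Directory entry: create it unless it already exists.
                if not os.path.isdir(target):
                    os.makedirs(target)
            else:
                # File entry: make sure its parent directory exists first,
                # since the archive may not contain a separate entry for it.
                parent = os.path.dirname(target)
                if parent and not os.path.isdir(parent):
                    os.makedirs(parent)
                out = open(target, 'wb')
                try:
                    out.write(z.read(name))
                finally:
                    out.close()
    finally:
        z.close()
```

Calling unzip_with_dirs('myfile.zip', 'output') would then recreate the archive's folder structure under output.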


Don't trust extract() or extractall().

These methods blindly extract files to the paths given in their filenames. But ZIP filenames can be anything at all, including dangerous strings like “x/../../../etc/passwd”. Extract such files and you could have just compromised your entire server.

Maybe this should be considered a reportable security hole in Python's zipfile module, but any number of zip-dearchivers have exhibited the exact same behaviour in the past. To unarchive a ZIP file with folder structure safely you need in-depth checking of each file path.
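One way to approach that per-path check, as a minimal sketch: resolve where each member would land and refuse anything that ends up outside the destination folder. The helper name is_safe_member is hypothetical, and this covers only ".." components and absolute paths; it is not a complete defence (it does not, for instance, consider symlinks created during extraction).

```python
import os


def is_safe_member(dest_dir, member_name):
    """Return True if member_name stays inside dest_dir once joined.

    Hypothetical helper: checks '..' traversal and absolute paths only.
    """
    root = os.path.realpath(dest_dir)
    # realpath collapses '..' components; an absolute member_name makes
    # os.path.join discard root entirely, which the check below rejects.
    target = os.path.realpath(os.path.join(root, member_name))
    # Normalise root so it ends with exactly one separator before comparing.
    prefix = root.rstrip(os.sep) + os.sep
    return target == root or target.startswith(prefix)
```

You would call this for every name in namelist() before writing anything, and skip or abort on any entry for which it returns False.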


I tried this out, and can reproduce it. The extractall method, as suggested by other answers, does not solve the problem. This seems like a bug in the zipfile module to me (perhaps Windows-only?), unless I'm misunderstanding how zipfiles are structured.

    testa\
    testa\testb\
    testa\testb\test.log
    > test.zip

    >>> from zipfile import ZipFile
    >>> zipTest = ZipFile("C:\\...\\test.zip")
    >>> zipTest.extractall("C:\\...\\")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "...\zipfile.py", line 940, in extractall
      File "...\zipfile.py", line 928, in extract
      File "...\zipfile.py", line 965, in _extract_member
    IOError: [Errno 2] No such file or directory: 'C:\\...\\testa\\testb\\test.log'

If I do a printdir(), I get this (first column):

    >>> zipTest.printdir()
    File Name
    testa/testb/
    testa/testb/test.log

If I try to extract just the first entry, like this:

    >>> zipTest.extract("testa/testb/")
    'C:\\...\\testa\\testb'

On disk, this results in the creation of a folder testa, with a file testb inside. This is apparently the reason why the subsequent attempt to extract test.log fails; testa\testb is a file, not a folder.

Edit #1: If you extract just the file, then it works:

    >>> zipTest.extract("testa/testb/test.log")
    'C:\\...\\testa\\testb\\test.log'

Edit #2: Jeff's code is the way to go: iterate through namelist(); if an entry is a directory, create it; otherwise, extract the file.