Python recursive folder read
Make sure you understand the three return values of os.walk:

    for root, subdirs, files in os.walk(rootdir):

has the following meaning:

root: Current path which is "walked through"
subdirs: Files in root of type directory
files: Files in root (not in subdirs) of type other than directory
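To see those three values concretely, here is a minimal sketch that builds a small throwaway tree and records what os.walk yields for it. The helper name describe_walk and the file names are made up for illustration; they are not from the question.

```python
import os
import tempfile


def describe_walk(top):
    """Return (relative root, sorted subdirs, sorted files) for each walked directory."""
    result = []
    for root, subdirs, files in os.walk(top):
        rel_root = os.path.relpath(root, top)  # '.' for the top directory itself
        result.append((rel_root, sorted(subdirs), sorted(files)))
    return sorted(result)


# Hypothetical tree: top/a.txt and top/sub/b.txt
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, 'sub'))
    for rel in ('a.txt', os.path.join('sub', 'b.txt')):
        with open(os.path.join(top, rel), 'w') as fh:
            fh.write('x')

    walked = describe_walk(top)
    for entry in walked:
        print(entry)
```

Each directory in the tree shows up once as root, with its own immediate subdirectories and files listed next to it.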
And please use os.path.join instead of concatenating with a slash! Your problem is filePath = rootdir + '/' + file: you must concatenate the currently "walked" folder instead of the topmost folder. So that must be filePath = os.path.join(root, file). BTW, "file" is a builtin (in Python 2), so you don't normally use it as a variable name.
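The difference is easy to check on a nested file. The sketch below is only an illustration (collect_paths and nested.txt are invented names): it builds both variants of the path and tests which of them actually exists.

```python
import os
import tempfile


def collect_paths(rootdir):
    """Return (correct, naive) path lists for every file below rootdir."""
    correct, naive = [], []
    for root, subdirs, files in os.walk(rootdir):
        for filename in files:
            correct.append(os.path.join(root, filename))  # joins the folder being walked
            naive.append(rootdir + '/' + filename)        # always uses the topmost folder
    return correct, naive


# Hypothetical tree with a single file one level down: rootdir/sub/nested.txt
with tempfile.TemporaryDirectory() as rootdir:
    os.mkdir(os.path.join(rootdir, 'sub'))
    with open(os.path.join(rootdir, 'sub', 'nested.txt'), 'w') as fh:
        fh.write('x')

    correct, naive = collect_paths(rootdir)
    correct_exists = [os.path.exists(p) for p in correct]
    naive_exists = [os.path.exists(p) for p in naive]
    print(correct_exists, naive_exists)
```

The naive path points at rootdir/nested.txt, which was never created, so opening it would fail for any file that is not directly in the top folder.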
Another problem is your loops, which should look like this, for example:
    import os
    import sys

    walk_dir = sys.argv[1]

    print('walk_dir = ' + walk_dir)

    # If your current working directory may change during script execution, it's recommended to
    # immediately convert program arguments to an absolute path. Then the variable root below will
    # be an absolute path as well. Example:
    # walk_dir = os.path.abspath(walk_dir)
    print('walk_dir (absolute) = ' + os.path.abspath(walk_dir))

    for root, subdirs, files in os.walk(walk_dir):
        print('--\nroot = ' + root)
        list_file_path = os.path.join(root, 'my-directory-list.txt')
        print('list_file_path = ' + list_file_path)

        with open(list_file_path, 'wb') as list_file:
            for subdir in subdirs:
                print('\t- subdirectory ' + subdir)

            for filename in files:
                file_path = os.path.join(root, filename)

                print('\t- file %s (full path: %s)' % (filename, file_path))

                with open(file_path, 'rb') as f:
                    f_content = f.read()
                    list_file.write(('The file %s contains:\n' % filename).encode('utf-8'))
                    list_file.write(f_content)
                    list_file.write(b'\n')
If you didn't know, the with statement for files is a shorthand:
    with open('filename', 'rb') as f:
        dosomething()

    # is effectively the same as

    f = open('filename', 'rb')
    try:
        dosomething()
    finally:
        f.close()
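A quick way to convince yourself of the "finally" part is to raise an error inside the with block and look at the file object afterwards. This is just a sketch; the temporary file and the simulated ValueError are stand-ins for real work.

```python
import os
import tempfile

# Create an empty temporary file to open (stand-in for a real input file).
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    with open(path, 'rb') as f:
        raise ValueError('simulated failure inside the block')
except ValueError:
    pass  # the error still propagates out of the with block; we just swallow it here

closed_after_error = f.closed  # f is still bound, but the file has been closed
print(closed_after_error)

os.remove(path)
```

The file is closed even though the block never finished normally, which is exactly what the try/finally version guarantees.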
If you are using Python 3.5 or above, you can get this done in 1 line.
    import glob

    # root_dir needs a trailing slash (i.e. /root/dir/)
    for filename in glob.iglob(root_dir + '**/*.txt', recursive=True):
        print(filename)
As mentioned in the documentation:
If recursive is true, the pattern '**' will match any files and zero or more directories and subdirectories.
If you want every file, you can use
    import glob

    for filename in glob.iglob(root_dir + '**/**', recursive=True):
        print(filename)
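If you want to try the recursive pattern without touching real data, something like the following sketch should do. The helper find_txt and the file names are hypothetical; os.path.join(tmp, '') is used only to get the trailing separator that root_dir needs.

```python
import glob
import os
import tempfile


def find_txt(root_dir):
    """Return all *.txt paths below root_dir, relative to it, using '/' as separator."""
    matches = glob.iglob(root_dir + '**/*.txt', recursive=True)
    return sorted(os.path.relpath(p, root_dir).replace(os.sep, '/') for p in matches)


# Hypothetical tree with .txt files at three depths plus one non-matching file
with tempfile.TemporaryDirectory() as tmp:
    root_dir = os.path.join(tmp, '')  # joining with '' appends the trailing separator
    os.makedirs(os.path.join(tmp, 'sub', 'deeper'))
    for rel in ('top.txt', 'skip.log', 'sub/mid.txt', 'sub/deeper/low.txt'):
        with open(os.path.join(tmp, *rel.split('/')), 'w') as fh:
            fh.write('x')

    txt_files = find_txt(root_dir)
    print(txt_files)
```

Note that '**' also matches zero directories, so top.txt directly inside root_dir is found as well.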
Agree with Dave Webb, os.walk will yield an item for each directory in the tree. Fact is, you just don't have to care about subFolders.
Code like this should work:
    import os
    import sys

    rootdir = sys.argv[1]

    for folder, subs, files in os.walk(rootdir):
        with open(os.path.join(folder, 'python-outfile.txt'), 'w') as dest:
            for filename in files:
                with open(os.path.join(folder, filename), 'r') as src:
                    dest.write(src.read())
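One thing to watch out for if the script is run more than once on the same tree: on the second run python-outfile.txt already exists, so it appears in files and would be read while it is being rewritten. A possible variation, sketched here under the assumption that the output name is fixed (OUT_NAME and concat_per_folder are names I made up), simply skips it:

```python
import os
import tempfile

OUT_NAME = 'python-outfile.txt'  # same output name as in the code above


def concat_per_folder(rootdir):
    """Write the contents of each folder's files into OUT_NAME inside that folder."""
    for folder, subs, files in os.walk(rootdir):
        with open(os.path.join(folder, OUT_NAME), 'w') as dest:
            for filename in sorted(files):
                if filename == OUT_NAME:
                    continue  # do not read the file we are currently writing
                with open(os.path.join(folder, filename), 'r') as src:
                    dest.write(src.read())


# Hypothetical input: two small files in one folder
with tempfile.TemporaryDirectory() as rootdir:
    for name, text in (('a.txt', 'A'), ('b.txt', 'B')):
        with open(os.path.join(rootdir, name), 'w') as fh:
            fh.write(text)

    concat_per_folder(rootdir)
    concat_per_folder(rootdir)  # second run: OUT_NAME now shows up in files

    with open(os.path.join(rootdir, OUT_NAME)) as fh:
        combined = fh.read()
    print(combined)
```

Sorting files is only there to make the order of the concatenated contents predictable; os.walk itself does not promise any particular order.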