python os.walk to certain level [duplicate]


You could do it like this:

    import os

    depth = 2

    # stuff is the starting directory from the question.
    # [1] abspath() already acts as normpath() and removes a trailing os.sep;
    #     we need to ensure there is no trailing os.sep so the slicing is accurate.
    # [2] abspath() also resolves /../, //// and "." even though os.walk can return them literally.
    # [3] expanduser() expands ~
    # [4] expandvars() expands $HOME
    stuff = os.path.abspath(os.path.expanduser(os.path.expandvars(stuff)))

    for root, dirs, files in os.walk(stuff):
        if root[len(stuff):].count(os.sep) < depth:
            for f in files:
                print(os.path.join(root, f))

The key is: if root[len(stuff):].count(os.sep) < depth

The slice removes the stuff prefix from root, so the result is relative to stuff. Then it just counts the number of path separators in what is left.

Here depth acts like the -maxdepth option of the Linux find command: -maxdepth 0 does nothing, -maxdepth 1 only scans files in the first level, and -maxdepth 2 also scans files in the immediate sub-directories.
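To make the separator counting concrete, here is a small sketch. The helper name relative_depth and the base directory are made up for illustration; only the slicing expression comes from the code above.

```python
import os

def relative_depth(root, base):
    """Count separators in the part of root that follows base (hypothetical helper)."""
    return root[len(base):].count(os.sep)

# Made-up base directory, built with os.sep so the sketch is portable.
base = os.path.join(os.sep, "data", "stuff")

for root in (
    base,                           # the top directory itself
    os.path.join(base, "a"),        # one level down
    os.path.join(base, "a", "b"),   # two levels down
):
    print(relative_depth(root, base), root)
```

With depth = 2, the files in the first two directories would be printed and the third would be skipped, because its relative depth is no longer below 2.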

Of course, os.walk still scans the full directory tree; the check only filters what gets printed. Unless the tree is very deep, that will work.

Another solution would be to use only os.listdir recursively (with a directory check) and a maximum recursion level, but that's a little trickier if you don't need it. Since it's not that hard, here's one implementation:

    import os

    def scanrec(root):
        rval = []

        def do_scan(start_dir, output, depth=0):
            for f in os.listdir(start_dir):
                ff = os.path.join(start_dir, f)
                if os.path.isdir(ff):
                    # only descend while fewer than 2 levels below root
                    if depth < 2:
                        do_scan(ff, output, depth + 1)
                else:
                    output.append(ff)

        do_scan(root, rval, 0)
        return rval

    print(scanrec(stuff))  # prints the files at most 2 directory levels below root

Note: os.listdir followed by os.path.isdir costs an extra stat call per entry, so this is not optimal. From Python 3.5 on, os.scandir can avoid that extra call, because its entries usually already carry the file type.
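As a rough sketch of what the os.scandir variant could look like: the function name scan_with_scandir and the max_depth parameter are my own, and the with statement around os.scandir assumes Python 3.6 or later.

```python
import os

def scan_with_scandir(root, max_depth=2):
    """Collect file paths at most max_depth directory levels below root."""
    found = []

    def do_scan(start_dir, depth):
        # os.scandir yields DirEntry objects; is_dir() can usually answer
        # from the directory listing itself, without a separate stat call.
        with os.scandir(start_dir) as entries:
            for entry in entries:
                if entry.is_dir():
                    if depth < max_depth:
                        do_scan(entry.path, depth + 1)
                else:
                    found.append(entry.path)

    do_scan(root, 0)
    return found
```

Calling scan_with_scandir(stuff) should return the same files as the os.listdir version, though not necessarily in the same order.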


You can count the separators and, if root is two levels deep, delete the contents of dirs so that walk doesn't recurse deeper:

    import os

    MAX_DEPTH = 2
    folders = ['Y:\\path1', 'Y:\\path2', 'Y:\\path3']

    for stuff in folders:
        for root, dirs, files in os.walk(stuff, topdown=True):
            print("there are", len(files), "files in", root)
            if root.count(os.sep) - stuff.count(os.sep) == MAX_DEPTH - 1:
                # empty dirs in place so os.walk does not descend any further
                del dirs[:]

The Python documentation states the following about this behavior:

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.

Note that you need to take into account the separators already present in the folders. For example, when y:\path1 is walked, the first root is y:\path1, which already contains a separator, but you don't want to stop recursion there.
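One way to package the pruning and the separator bookkeeping is a small generator. This is only a sketch: the name walk_max_depth is hypothetical, it normalizes the starting folder with os.path.abspath first, and it does not handle a filesystem root (such as / or Y:\) as the starting folder.

```python
import os

def walk_max_depth(top, max_depth):
    """Yield os.walk tuples, descending at most max_depth levels (top counts as level 1)."""
    top = os.path.abspath(top)       # normalize so a trailing separator is not counted
    base_seps = top.count(os.sep)    # separators already present in the starting folder
    for root, dirs, files in os.walk(top, topdown=True):
        yield root, dirs, files
        # Runs when the caller asks for the next item, before os.walk reads dirs again.
        if root.count(os.sep) - base_seps >= max_depth - 1:
            del dirs[:]              # prune in place so os.walk stops descending here
```

With max_depth=2 this visits the starting folder and its immediate subfolders, matching MAX_DEPTH = 2 in the loop above. The starting folder itself is always yielded, even for max_depth values below 1.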