Quicker to os.walk or glob?

python traversal glob os.walk directory-walk

I ran a test on a small cache of web pages stored in 1000 directories. The task was to count the total number of files in those directories. The output is:

os.listdir: 0.7268s, 1326786 files found
os.walk: 3.6592s, 1326787 files found
glob.glob: 2.0133s, 1326786 files found

As you can see, os.listdir is the quickest of the three, and glob.glob is still quicker than os.walk for this task.

The source:

import glob
import os
import time

# Count files by listing each of the 1000 numbered directories directly.
n, t = 0, time.time()
for i in range(1000):
    n += len(os.listdir("./%d" % i))
t = time.time() - t
print("os.listdir: %.4fs, %d files found" % (t, n))

# Count files by walking the whole tree from the current directory.
n, t = 0, time.time()
for root, dirs, files in os.walk("./"):
    for file in files:
        n += 1
t = time.time() - t
print("os.walk: %.4fs, %d files found" % (t, n))

# Count files by globbing each numbered directory.
n, t = 0, time.time()
for i in range(1000):
    n += len(glob.glob("./%d/*" % i))
t = time.time() - t
print("glob.glob: %.4fs, %d files found" % (t, n))


Don't waste your time on optimization before measuring/profiling. Focus on making your code simple and easy to maintain.

For example, in your code you precompile the RE, which gives you little or no speed boost, because the re module keeps an internal cache (re._cache) of compiled REs.
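A rough way to check this on your own machine is to time both styles with timeit. This is only a minimal sketch: the pattern and sample string are made up for illustration, and the numbers will vary with the Python version and hardware.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
PATTERN = r"\d+\.html$"
SAMPLE = "cache/42/12345.html"

compiled = re.compile(PATTERN)

def with_precompiled():
    # Uses the pattern object compiled once above.
    return compiled.search(SAMPLE)

def with_module_function():
    # re.search finds the pattern in the re module's internal cache
    # after the first call, so it is not recompiled every time.
    return re.search(PATTERN, SAMPLE)

if __name__ == "__main__":
    for func in (with_precompiled, with_module_function):
        seconds = timeit.timeit(func, number=100000)
        print("%s: %.4fs" % (func.__name__, seconds))
```

Both functions return the same match; only the measured time tells you whether precompiling matters for your workload.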

Keep it simple.
If it's slow, then profile.
Once you know exactly what needs to be optimized, make some tweaks and always document them.
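
The "profile first" step can be sketched with the standard cProfile and pstats modules. The count_files function and the throwaway directory tree below are hypothetical stand-ins for whatever code you suspect is slow.

```python
import cProfile
import io
import os
import pstats
import tempfile

def count_files(top):
    # Hypothetical workload: count every file under `top` with os.walk.
    total = 0
    for root, dirs, files in os.walk(top):
        total += len(files)
    return total

def profile_count(top, limit=5):
    # Run count_files under the profiler and return (result, report text).
    profiler = cProfile.Profile()
    result = profiler.runcall(count_files, top)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()

def make_sample_tree(top, dirs=3, files_per_dir=4):
    # Build a small throwaway tree so the sketch runs anywhere.
    for d in range(dirs):
        path = os.path.join(top, str(d))
        os.mkdir(path)
        for f in range(files_per_dir):
            open(os.path.join(path, "%d.html" % f), "w").close()

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        make_sample_tree(top)
        found, report = profile_count(top)
        print("%d files found" % found)
        print(report)
```

The report lists the calls with the highest cumulative time first, which is usually enough to see where the time goes before changing anything.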

Note that an optimization made several years ago can make code run slower than the "non-optimized" code would today. This applies especially to modern JIT-based languages.


You can use os.walk and still use glob-style matching.

import fnmatch
import os

for root, dirs, files in os.walk(DIRECTORY):
    for file in files:
        if fnmatch.fnmatch(file, PATTERN):
            print(file)

I'm not sure about the speed, but since os.walk is recursive and a plain glob pattern is not, they do different things.
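
To make the difference concrete, here is a small sketch that builds a throwaway tree and matches *.html both ways. The file names and helper functions are invented for illustration.

```python
import fnmatch
import glob
import os
import tempfile

def walk_match(top, pattern):
    # Recursive: checks file names at every level below `top`.
    matches = []
    for root, dirs, files in os.walk(top):
        for name in fnmatch.filter(files, pattern):
            matches.append(os.path.join(root, name))
    return sorted(matches)

def glob_match(top, pattern):
    # Not recursive: only looks directly inside `top`.
    return sorted(glob.glob(os.path.join(top, pattern)))

def make_sample_tree(top):
    # Hypothetical layout: one page at the top level, one in a subdirectory.
    os.mkdir(os.path.join(top, "sub"))
    for rel in ("a.html", "notes.txt", os.path.join("sub", "b.html")):
        open(os.path.join(top, rel), "w").close()

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        make_sample_tree(top)
        print("os.walk + fnmatch:", len(walk_match(top, "*.html")))
        print("glob.glob:", len(glob_match(top, "*.html")))
```

On this tree the os.walk version finds both a.html and sub/b.html, while glob.glob finds only a.html.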
