Is there a convenient way to map a file uri to os.path?

Is there a convenient way to map a file uri to os.path?


Use urllib.parse.urlparse to get the path from the URI:

    import os
    from urllib.parse import urlparse

    p = urlparse('file://C:/test/doc.txt')
    final_path = os.path.abspath(os.path.join(p.netloc, p.path))
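To see why the answer joins `netloc` and `path`, it helps to look at what `urlparse` returns for that URI. The sketch below is only an illustration: it uses `ntpath` (the Windows flavour of `os.path`) so it behaves the same on any platform, and it swaps in `normpath` for `abspath` because the path is already absolute. The variable names are my own.

```python
import ntpath  # Windows implementation of os.path, importable on any OS
from urllib.parse import urlparse

parsed = urlparse("file://C:/test/doc.txt")

# With only two slashes after "file:", the drive letter lands in the
# host slot (netloc) and the rest becomes the path.
print(parsed.netloc)  # C:
print(parsed.path)    # /test/doc.txt

# Joining the two pieces puts the drive back in front of the path;
# normpath then converts the forward slashes to backslashes.
joined = ntpath.join(parsed.netloc, parsed.path)
windows_path = ntpath.normpath(joined)
print(windows_path)   # C:\test\doc.txt
```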


The solution from @Jakob Bowyer doesn't decode URL-encoded (percent-encoded) characters back to regular characters. For that you need to use urllib.parse.unquote:

    >>> from urllib.parse import unquote, urlparse
    >>> unquote(urlparse('file:///home/user/some%20file.txt').path)
    '/home/user/some file.txt'
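The same two calls can be wrapped in a small helper. This is a minimal sketch, not a complete solution: the name `file_uri_to_posix_path` is hypothetical, it ignores any host in the URI, and it does no Windows drive-letter handling.

```python
from urllib.parse import unquote, urlparse


def file_uri_to_posix_path(uri):
    """Return the percent-decoded path component of a file URI.

    Hypothetical helper: any host part of the URI is ignored and
    no Windows-specific conversion is attempted.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError("not a file URI: %r" % (uri,))
    return unquote(parsed.path)


print(file_uri_to_posix_path("file:///home/user/some%20file.txt"))
# /home/user/some file.txt
```

Rejecting other schemes up front is a design choice of this sketch; without it, an `http://` URL would silently yield its path as if it were a local file.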


Of all the answers so far, I found none that catches edge cases, requires no branching, is compatible with both Python 2 and 3, and is cross-platform.

In short, this does the job, using only the standard library:

    import os

    try:
        from urllib.parse import urlparse, unquote
        from urllib.request import url2pathname
    except ImportError:
        # backwards compatibility (Python 2)
        from urlparse import urlparse
        from urllib import unquote, url2pathname


    def uri_to_path(uri):
        parsed = urlparse(uri)
        # Prefix the path with the host from the URI, e.g. \\host\ on Windows
        host = "{0}{0}{mnt}{0}".format(os.path.sep, mnt=parsed.netloc)
        return os.path.normpath(
            os.path.join(host, url2pathname(unquote(parsed.path)))
        )

The tricky bit (I found) was when working in Windows with paths specifying a host. This is a non-issue outside of Windows: network locations in *NIX can only be reached via paths after being mounted to the root of the filesystem.

From Wikipedia: A file URI takes the form of file://host/path , where host is the fully qualified domain name of the system on which the path is accessible [...]. If host is omitted, it is taken to be "localhost".
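As a rough illustration of that host/path split, `urlparse` exposes the host as `netloc`. In the sketch below, `split_file_uri` is a hypothetical helper, and substituting "localhost" for an empty host is my reading of the quoted rule, not something `urlparse` does itself.

```python
from urllib.parse import urlparse


def split_file_uri(uri):
    """Split a file URI into (host, path).

    Hypothetical helper: an empty host is reported as "localhost",
    following the rule quoted from Wikipedia above.
    """
    parsed = urlparse(uri)
    host = parsed.netloc or "localhost"
    return host, parsed.path


print(split_file_uri("file://networkstorage/homes/rdekleer"))
# ('networkstorage', '/homes/rdekleer')
print(split_file_uri("file:///home/rdekleer/.profile"))
# ('localhost', '/home/rdekleer/.profile')
```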

With that in mind, I make it a rule to ALWAYS prefix the path with the netloc provided by urlparse, before passing it to os.path.abspath, which is necessary as it removes any resulting redundant slashes (os.path.normpath, which also claims to fix the slashes, can get a little over-zealous in Windows, hence the use of abspath).

The other crucial component in the conversion is using unquote to decode the URL percent-encoding, which your filesystem won't otherwise understand. Again, this might be a bigger issue on Windows, which allows things like $ and spaces in paths, which will have been encoded in the file URI.
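A quick way to see the decoding step in isolation, using a path fragment made up from the kinds of characters mentioned above (`$` and spaces). The sample string is my own; only `quote` and `unquote` from the standard library are used.

```python
from urllib.parse import quote, unquote

# Hypothetical path fragment as it would appear inside a file URI
encoded = "/c%24/paths%20with%20spaces.txt"

# unquote turns each %XX escape back into the character it stands for
decoded = unquote(encoded)
print(decoded)  # /c$/paths with spaces.txt

# quote goes the other way; "/" is left alone by default
print(quote(decoded))  # /c%24/paths%20with%20spaces.txt
```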

For a demo:

    import os
    from pathlib import Path   # This demo requires pip install for Python < 3.4
    import sys

    try:
        from urllib.parse import urlparse, unquote
        from urllib.request import url2pathname
    except ImportError:  # backwards compatibility (Python 2)
        from urlparse import urlparse
        from urllib import unquote, url2pathname

    DIVIDER = "-" * 30

    if sys.platform == "win32":  # WINDOWS
        filepaths = [
            r"C:\Python27\Scripts\pip.exe",
            r"C:\yikes\paths with spaces.txt",
            r"\\localhost\c$\WINDOWS\clock.avi",
            r"\\networkstorage\homes\rdekleer",
        ]
    else:  # *NIX
        filepaths = [
            os.path.expanduser("~/.profile"),
            "/usr/share/python3/py3versions.py",
        ]

    for path in filepaths:
        uri = Path(path).as_uri()
        parsed = urlparse(uri)
        host = "{0}{0}{mnt}{0}".format(os.path.sep, mnt=parsed.netloc)
        normpath = os.path.normpath(
            os.path.join(host, url2pathname(unquote(parsed.path)))
        )
        absolutized = os.path.abspath(
            os.path.join(host, url2pathname(unquote(parsed.path)))
        )
        result = ("{DIVIDER}"
                  "\norig path:       \t{path}"
                  "\nconverted to URI:\t{uri}"
                  "\nrebuilt normpath:\t{normpath}"
                  "\nrebuilt abspath:\t{absolutized}").format(**locals())
        print(result)
        assert path == absolutized

Results (WINDOWS):

    ------------------------------
    orig path:              C:\Python27\Scripts\pip.exe
    converted to URI:       file:///C:/Python27/Scripts/pip.exe
    rebuilt normpath:       C:\Python27\Scripts\pip.exe
    rebuilt abspath:        C:\Python27\Scripts\pip.exe
    ------------------------------
    orig path:              C:\yikes\paths with spaces.txt
    converted to URI:       file:///C:/yikes/paths%20with%20spaces.txt
    rebuilt normpath:       C:\yikes\paths with spaces.txt
    rebuilt abspath:        C:\yikes\paths with spaces.txt
    ------------------------------
    orig path:              \\localhost\c$\WINDOWS\clock.avi
    converted to URI:       file://localhost/c%24/WINDOWS/clock.avi
    rebuilt normpath:       \localhost\c$\WINDOWS\clock.avi
    rebuilt abspath:        \\localhost\c$\WINDOWS\clock.avi
    ------------------------------
    orig path:              \\networkstorage\homes\rdekleer
    converted to URI:       file://networkstorage/homes/rdekleer
    rebuilt normpath:       \networkstorage\homes\rdekleer
    rebuilt abspath:        \\networkstorage\homes\rdekleer

Results (*NIX):

    ------------------------------
    orig path:              /home/rdekleer/.profile
    converted to URI:       file:///home/rdekleer/.profile
    rebuilt normpath:       /home/rdekleer/.profile
    rebuilt abspath:        /home/rdekleer/.profile
    ------------------------------
    orig path:              /usr/share/python3/py3versions.py
    converted to URI:       file:///usr/share/python3/py3versions.py
    rebuilt normpath:       /usr/share/python3/py3versions.py
    rebuilt abspath:        /usr/share/python3/py3versions.py