Why is os.sep insufficient for path operations?

I was reading the documentation for Python’s os module recently, and a sentence caught my eye:

The character used by the operating system to separate pathname components. This is '/' for POSIX and '\\' for Windows. Note that knowing this is not sufficient to be able to parse or concatenate pathnames — use os.path.split() and os.path.join() — but it is occasionally useful.

I was naturally curious: when is os.sep not sufficient?

I decided to have a peek at the implementation of os.path.split() and os.path.join() – how are they more complicated than a simple str.split() and str.join()?
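As a baseline, here's a small sketch of how the naive string methods differ from the os.path functions even on a well-behaved path. I'm using posixpath directly so the result doesn't depend on the machine it runs on; the path itself is just an illustrative value.

```python
import posixpath

path = "/usr/local/bin/python"  # illustrative path, not taken from the docs

# Naive approach: break on the separator everywhere it appears.
# The leading slash produces an empty first element.
naive_parts = path.split("/")
print(naive_parts)   # ['', 'usr', 'local', 'bin', 'python']

# posixpath.split() only separates the final component from the rest.
head, tail = posixpath.split(path)
print(head, tail)    # /usr/local/bin python

# Naive joining doubles the separator if a piece already ends with one...
naive_joined = "/".join(["/usr/local/", "bin"])
print(naive_joined)  # /usr/local//bin

# ...whereas posixpath.join() only adds a separator where one is missing.
joined = posixpath.join("/usr/local/", "bin")
print(joined)        # /usr/local/bin
```

So the two pairs of functions don't even have the same shape: str.split() returns every piece, while os.path.split() always returns a (head, tail) pair.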

First, it’s worth understanding how os.path works.

Because paths behave differently on different platforms, there are multiple modules for manipulating paths in the standard library – posixpath for UNIX-style paths, and ntpath for Windows-style paths. Older versions of Python also included macpath for classic Mac OS paths and os2emxpath for OS/2 EMX paths.

When you use os.path, it selects the appropriate module for your platform. Using os.path means your code should do the right thing, even if the code is run on a different platform to where you wrote it. If you do want a specific style of path, you can import the specific module, which is what I’ll do in my examples.
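To make that concrete, here's a minimal sketch that feeds the same string to both flavours. The Windows-style path is a made-up example; the point is only that the two modules disagree about where the components are.

```python
import ntpath
import os.path
import posixpath

# os.path is simply one of the platform-specific modules under another name.
print(os.path.__name__)  # 'posixpath' or 'ntpath', depending on the platform

path = r"C:\Users\alex\notes.txt"  # hypothetical Windows-style path

# ntpath treats the backslash as a separator...
nt_result = ntpath.split(path)
print(nt_result)     # ('C:\\Users\\alex', 'notes.txt')

# ...but to posixpath a backslash is an ordinary character, so the whole
# string is a single filename with no directory part.
posix_result = posixpath.split(path)
print(posix_result)  # ('', 'C:\\Users\\alex\\notes.txt')
```

Importing ntpath or posixpath by name like this gives the same answer on any machine, which is handy for experiments.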

I looked at four files to find out what these functions do:

Even without reading the code, the length of the functions and their tests give a clue as to how complicated they are.

Here are a few of my favourite examples where paths are trickier than simple strings:
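The cases below are a sketch of the sort of behaviour I mean, written against posixpath and ntpath directly. The paths are illustrative values I've chosen for the demonstration, not a list lifted from the standard library's tests.

```python
import ntpath
import posixpath

# 1. Joining onto an absolute path discards everything before it.
abs_join = posixpath.join("/home/alex", "/etc/passwd")
print(abs_join)        # /etc/passwd

# 2. A trailing slash means the "tail" is empty, not the last directory name.
trailing = posixpath.split("/usr/bin/")
print(trailing)        # ('/usr/bin', '')

# 3. The root directory splits into itself and an empty tail.
root = posixpath.split("/")
print(root)            # ('/', '')

# 4. Windows accepts '/' as a separator too, so splitting on '\\' alone
#    would miss these components entirely.
forward = ntpath.split("C:/Users/alex")
print(forward)         # ('C:/Users', 'alex')

# 5. Joining onto a bare drive letter inserts no separator: 'C:file.txt' is
#    relative to the current directory on drive C:, not the same as 'C:\\file.txt'.
drive_relative = ntpath.join("C:", "file.txt")
print(drive_relative)  # C:file.txt
```

None of these fall out of a plain str.split(os.sep) or os.sep.join(...): each one needs the function to know something about what the pieces of a path mean.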

There’s also a lot of code for Windows drive letters and UNC paths, two concepts I’m totally unfamiliar with.
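As a rough illustration of what that code has to cope with, here's a sketch using ntpath.splitdrive(), which separates the drive or UNC share from the rest of the path. The server, share and folder names are hypothetical.

```python
import ntpath

# A drive letter is split off as its own component.
drive_letter = ntpath.splitdrive(r"C:\Users\alex")
print(drive_letter)  # ('C:', '\\Users\\alex')

# For a UNC path, the server and share together play the role of the drive.
unc_path = r"\\server\share\folder\file.txt"  # hypothetical network path
unc = ntpath.splitdrive(unc_path)
print(unc)           # ('\\\\server\\share', '\\folder\\file.txt')

# A naive split on the separator has no idea that the first two
# names belong together, and yields two empty strings up front.
naive = unc_path.split("\\")
print(naive)         # ['', '', 'server', 'share', 'folder', 'file.txt']
```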

I don’t pretend to have a complete understanding of how paths work after this quick skim – but I did enjoy digging into it, and now I feel I have a better understanding of why I shouldn’t use os.sep to do my own path operations.