Why is startswith slower than slicing

python startswith

Some of the performance difference can be explained by the time it takes the . operator to look up the startswith method on every call:

>>> x = 'foobar'
>>> y = 'foo'
>>> sw = x.startswith
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 316 ns per loop
>>> %timeit sw(y)
1000000 loops, best of 3: 267 ns per loop
>>> %timeit x[:3] == y
10000000 loops, best of 3: 151 ns per loop
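For anyone who wants to rerun this outside IPython, here is a minimal sketch using the standard timeit module. The helper name best_ns and the loop counts are my own choices, not part of the measurements above, and the absolute numbers will vary with the machine and the Python version.

```python
import timeit

x = 'foobar'
y = 'foo'
sw = x.startswith  # bound method looked up once, outside the timed statement

# Names the timed statements are allowed to see.
namespace = {'x': x, 'y': y, 'sw': sw}


def best_ns(stmt, number=200_000, repeat=3):
    """Return the best-of-`repeat` time per execution of stmt, in nanoseconds."""
    best = min(timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat))
    return best / number * 1e9


statements = ('x.startswith(y)', 'sw(y)', 'x[:3] == y')
timings = {stmt: best_ns(stmt) for stmt in statements}

for stmt, ns in timings.items():
    print(f'{stmt:<18} {ns:8.1f} ns per call')
```

The gap between the first two rows is a rough estimate of the attribute lookup cost, since the only difference is whether x.startswith is resolved inside the timed statement.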

Another portion of the difference can be explained by the fact that startswith is a function, and even no-op function calls take a bit of time:

>>> def f():
...     pass
...
>>> %timeit f()
10000000 loops, best of 3: 105 ns per loop

This does not totally explain the difference, since the version using slicing and len calls a function and is still faster (compare to sw(y) above -- 267 ns):

>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 213 ns per loop

My only guess here is that maybe Python optimizes lookup time for built-in functions, or that len calls are heavily optimized (which is probably true). It might be possible to test that with a custom len func. Or possibly this is where the differences identified by LastCoder kick in. Note also larsmans' results, which indicate that startswith is actually faster for longer strings. The whole line of reasoning above applies only to those cases where the overhead I'm talking about actually matters.
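The custom len test could be sketched roughly as follows. The wrapper my_len and the helper best_ns are hypothetical names I'm introducing for illustration; my_len just forwards to the built-in, so any extra time in the second statement comes from one additional pure-Python function call.

```python
import timeit

x = 'foobar'
y = 'foo'


def my_len(s):
    # Pure-Python wrapper: one ordinary function call on top of the built-in len().
    return len(s)


# Names the timed statements are allowed to see (built-ins are added automatically).
namespace = {'x': x, 'y': y, 'my_len': my_len}


def best_ns(stmt, number=200_000, repeat=3):
    """Return the best-of-`repeat` time per execution of stmt, in nanoseconds."""
    best = min(timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat))
    return best / number * 1e9


results = {
    stmt: best_ns(stmt)
    for stmt in ('x[:len(y)] == y', 'x[:my_len(y)] == y')
}

for stmt, ns in results.items():
    print(f'{stmt:<22} {ns:8.1f} ns per call')
```

If the my_len version comes out noticeably slower, that would be consistent with the guess that calls to built-ins like len are cheaper than calls to Python-level functions, though it doesn't prove how the interpreter achieves that.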

python startswith

The comparison isn't fair since you're only measuring the case where startswith returns True.

>>> x = 'foobar'
>>> y = 'fool'
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 221 ns per loop
>>> %timeit x[:3] == y  # note: length mismatch
10000000 loops, best of 3: 122 ns per loop
>>> %timeit x[:4] == y
10000000 loops, best of 3: 158 ns per loop
>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 210 ns per loop
>>> sw = x.startswith
>>> %timeit sw(y)
10000000 loops, best of 3: 176 ns per loop

Also, for much longer strings, startswith is a lot faster:

>>> import random
>>> import string
>>> x = '%030x' % random.randrange(256**10000)
>>> len(x)
20000
>>> y = x[:4000]
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 211 ns per loop
>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 469 ns per loop
>>> sw = x.startswith
>>> %timeit sw(y)
10000000 loops, best of 3: 168 ns per loop

This is still true when there's no match.

# change last character of y
>>> y = y[:-1] + chr((ord(y[-1]) + 1) % 256)
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 210 ns per loop
>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 470 ns per loop
>>> %timeit sw(y)
10000000 loops, best of 3: 168 ns per loop

# change first character of y
>>> y = chr((ord(y[0]) + 1) % 256) + y[1:]
>>> %timeit x.startswith(y)
1000000 loops, best of 3: 210 ns per loop
>>> %timeit x[:len(y)] == y
1000000 loops, best of 3: 442 ns per loop
>>> %timeit sw(y)
10000000 loops, best of 3: 168 ns per loop

So, startswith is probably slower for short strings because it's optimized for long ones.
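To see where the two approaches cross over on a given machine, a sketch like the following could sweep a few string sizes. The function name compare and the chosen sizes are my own assumptions, and it builds the random string with random.choices rather than the hex-formatting trick used above.

```python
import random
import timeit


def compare(total_len, prefix_len, number=50_000, repeat=3):
    """Time startswith against slicing for a random hex string and its prefix.

    Returns a dict mapping each timed statement to nanoseconds per call.
    """
    x = ''.join(random.choices('0123456789abcdef', k=total_len))
    y = x[:prefix_len]  # a prefix that always matches
    namespace = {'x': x, 'y': y}
    out = {}
    for stmt in ('x.startswith(y)', 'x[:len(y)] == y'):
        best = min(timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat))
        out[stmt] = best / number * 1e9
    return out


for total_len, prefix_len in [(6, 3), (200, 40), (20000, 4000)]:
    row = compare(total_len, prefix_len)
    print(total_len, prefix_len, {k: round(v, 1) for k, v in row.items()})
```

The slicing version has to build a new string of len(y) characters before comparing, so its cost should grow with the prefix length, while startswith can compare in place.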

(Trick to get random strings taken from this answer.)

python startswith

startswith is more complex than slicing...

result = _string_tailmatch(self,
                PyTuple_GET_ITEM(subobj, i),
                start, end, -1);

What happens here isn't a simple character-compare loop checking for the needle at the beginning of the haystack. When a tuple of prefixes is passed, a for loop iterates through the tuple (subobj) and calls another function (_string_tailmatch) on each item. Multiple function calls have overhead with regard to the stack, argument sanity checks, etc.

startswith is a library function while the slicing appears to be built into the language.

if (!stringlib_parse_args_finds("startswith", args, &subobj, &start, &end))
    return NULL;
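As a rough Python model of the extra work, assuming the C code behaves the way the snippets suggest, the dispatch could be pictured like this. The names startswith_model and _tailmatch_model are hypothetical, the optional start and end arguments are left out, and this is only an illustration, not the actual CPython implementation.

```python
def _tailmatch_model(s, prefix):
    # Stand-in for the C helper: compare the head of s against prefix.
    return s[:len(prefix)] == prefix


def startswith_model(s, prefix):
    """Simplified model of str.startswith without the start/end arguments."""
    # Argument checking happens on every call, before any comparison.
    if isinstance(prefix, tuple):
        # Tuple case: loop over the candidates, one helper call per item.
        for candidate in prefix:
            if not isinstance(candidate, str):
                raise TypeError('tuple for startswith must only contain str')
            if _tailmatch_model(s, candidate):
                return True
        return False
    if not isinstance(prefix, str):
        raise TypeError('startswith first arg must be str or a tuple of str')
    return _tailmatch_model(s, prefix)


print(startswith_model('foobar', 'foo'))
print(startswith_model('foobar', ('x', 'fool', 'foo')))
```

Every branch and type check in the model corresponds to work that x[:3] == y never has to do, which is the overhead being described for short strings.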
