What could affect Python string comparison performance for strings over 64 characters?

python string performance time-complexity

Python can 'intern' short strings: it stores them in a special cache and reuses string objects from that cache.

When comparing strings, it first tests whether both operands are the same pointer (as they would be for an interned string):

    if (a == b) {
        switch (op) {
        case Py_EQ:case Py_LE:case Py_GE:
            result = Py_True;
            goto out;
        // ...

Only if that pointer comparison fails does it use a size check and memcmp to compare the strings.

However, interning normally takes place only for identifiers (function names, arguments, attributes, etc.), not for string values created at runtime.
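To see the difference between runtime strings and interned ones, here is a minimal sketch. It assumes CPython 3, where interning is exposed as sys.intern (in Python 2 it was the builtin intern); the helper name build is made up for illustration.

```python
import sys


def build(n):
    # Build the string at runtime so it is not a compile-time constant.
    return "".join(["a"] * n)


a = build(100)
b = build(100)

same_value = a == b        # equal contents
same_object = a is b       # two separate objects in CPython

# Explicit interning maps equal strings to one shared object.
ia = sys.intern(a)
ib = sys.intern(b)
interned_same_object = ia is ib

print(same_value, same_object, interned_same_object)
```

After interning, comparing ia and ib can take the pointer shortcut, while comparing a and b has to go through the content comparison.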

Another possible culprit is string constants: string literals used in code are stored as constants at compile time and reused throughout. Again, only one object is created, so comparisons between those hit the fast identity test.
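A small sketch of the constant reuse, assuming CPython (this is an implementation detail, not a language guarantee); the function name greeting is hypothetical:

```python
def greeting():
    # The literal is stored once in the function's code object.
    return "a string literal that is long enough to exceed sixty-four characters in total"


first = greeting()
second = greeting()

# Both calls hand back the very same constant object.
literal_reused = first is second

# That object lives in the code object's constants tuple.
in_consts = any(c is first for c in greeting.__code__.co_consts)

print(literal_reused, in_consts)
```

So if a benchmark compares a literal against itself, it may be measuring the identity shortcut rather than a character-by-character comparison.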

For string objects that are not the same, Python tests for equal length and equal first characters, then uses the memcmp() function on the internal C strings. If your strings are not interned and are not otherwise reusing the same objects, all other speed characteristics come down to the memcmp() function.
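The order of those checks can be sketched in pure Python. This is only an illustration of the steps described above, not CPython's actual code; sketch_equal is a made-up name and the final loop merely stands in for memcmp().

```python
def sketch_equal(a: str, b: str) -> bool:
    # 1. Identity shortcut: the same object is trivially equal.
    if a is b:
        return True
    # 2. Strings of different length cannot be equal.
    if len(a) != len(b):
        return False
    # 3. Cheap early exit on the first character.
    if a and a[0] != b[0]:
        return False
    # 4. Stand-in for memcmp(): compare the contents position by position.
    return all(x == y for x, y in zip(a, b))


print(sketch_equal("spam" * 20, "spam" * 20))
print(sketch_equal("spam" * 20, "spam" * 19 + "spaz"))
```

Only step 4 scales with the string length, which is why the cost for distinct, equal-length strings is dominated by the memory comparison.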

I am just making wild guesses, but you asked "what might" rather than "what does", so here are some possibilities:

The CPU cache line size is 64 bytes and longer strings cause a cache miss.
Python might store strings of up to 64 bytes in one kind of structure and longer strings in a more complicated structure.
Related to the last one: it might zero-pad strings into a 64-byte array and be able to use very fast SSE2 vector instructions to match two strings.
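Guesses like these are easiest to check by measuring. Below is a rough timing sketch, assuming CPython 3 and the standard timeit module; the helper name time_equal_compare and the chosen lengths are arbitrary, and the numbers will differ between machines and Python versions.

```python
import timeit


def time_equal_compare(length, repeats=5, number=20000):
    # Build two equal but distinct string objects at runtime, so the
    # identity shortcut cannot hide the cost of comparing the contents.
    a = "".join(["x"] * length)
    b = "".join(["x"] * length)
    best = min(
        timeit.repeat("a == b", globals={"a": a, "b": b},
                      repeat=repeats, number=number)
    )
    return best / number  # seconds per comparison


timings = {n: time_equal_compare(n) for n in (32, 63, 64, 65, 128, 256)}
for n, seconds in timings.items():
    print(f"{n:4d} chars: {seconds * 1e9:8.1f} ns per comparison")
```

A visible step between 64 and 65 characters would be consistent with one of the guesses above, but it would not tell you which one.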
