Why is it string.join(list) instead of list.join(string)?


It's because any iterable can be joined (e.g., list, tuple, dict, set), but its contents and the "joiner" must be strings.

For example:

'_'.join(['welcome', 'to', 'stack', 'overflow'])
'_'.join(('welcome', 'to', 'stack', 'overflow'))
'welcome_to_stack_overflow'
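
To make the "any iterable" point concrete, here is a small sketch for a current Python 3 interpreter. The variable names are mine and purely illustrative; the only API used is str.join itself.

```python
words = ['welcome', 'to', 'stack', 'overflow']

# The same separator method accepts a list, a tuple, or a generator expression.
from_list = '_'.join(words)
from_tuple = '_'.join(tuple(words))
from_generator = '_'.join(word.upper() for word in words)

# Iterating over a dict yields its keys, so join() joins the keys.
from_dict = '_'.join({'a': 1, 'b': 2})

# Sets have no defined order, so sort first if the order matters.
from_set = '_'.join(sorted({'b', 'a'}))

print(from_list)       # welcome_to_stack_overflow
print(from_generator)  # WELCOME_TO_STACK_OVERFLOW
print(from_dict)       # a_b
print(from_set)        # a_b
```

None of these container types needs its own join method; the separator string does the work for all of them.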

Using something other than strings will raise the following error:

TypeError: sequence item 0: expected str instance, int found
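
A minimal sketch of how that error shows up and one common way around it, assuming the items have a sensible str() representation. The names items, error_message and joined are just for illustration, and the exact message text is what CPython 3 produces.

```python
items = [1, 2, 3]

# Joining non-string items fails with a TypeError.
try:
    '_'.join(items)
except TypeError as exc:
    error_message = str(exc)

# Converting each item to str first makes the join work.
joined = '_'.join(str(item) for item in items)

print(error_message)
print(joined)  # 1_2_3
```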


This was discussed in the "String methods... finally" thread in the Python-Dev archive, and was accepted by Guido. The thread began in June 1999, and str.join was included in Python 1.6, which was released in September 2000 (and supported Unicode). Python 2.0, which also supported str methods including join, was released in October 2000.

  • There were four options proposed in this thread:
    • str.join(seq)
    • seq.join(str)
    • seq.reduce(str)
    • join as a built-in function
  • Guido wanted to support not only lists and tuples, but all sequences/iterables.
  • seq.reduce(str) is difficult for newcomers.
  • seq.join(str) introduces an unexpected dependency from sequences to str/unicode.
  • join() as a built-in function would support only specific data types, so putting it in the built-in namespace is not a good fit. If join() supported many data types, creating an optimized implementation would be difficult, and if it were implemented using the __add__ method it would be O(n²).
  • The separator string (sep) should not be omitted. Explicit is better than implicit.
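
To illustrate the O(n²) concern, here is a rough sketch of what a join built on + (that is, __add__) could look like. The function join_with_add is hypothetical and is not taken from the thread; it only shows why the approach scales badly: every + builds a new string and copies everything accumulated so far.

```python
from functools import reduce


def join_with_add(sep, items):
    """Hypothetical join built only on +.

    Each step creates a new string containing everything joined so far,
    so the total copying grows roughly quadratically with the input size.
    Note that reduce() raises TypeError on an empty iterable here.
    """
    return reduce(lambda acc, item: acc + sep + item, items)


words = ['welcome', 'to', 'stack', 'overflow']

slow_result = join_with_add('_', words)
fast_result = '_'.join(words)

print(slow_result == fast_result)  # True
```

Both calls give the same string. The difference is that str.join knows it is producing a string, so it is free to compute the final size up front and copy each piece once.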

Here are some additional thoughts (my own, and my friend's):

  • Unicode support was coming, but it was not final. At that time, UTF-8 seemed the most likely to replace UCS-2/4. To calculate the total buffer length of UTF-8 strings, the implementation needs to know the character encoding rule.
  • At that time, Python had already settled on a common sequence interface that let users create their own sequence-like (iterable) classes. But Python didn't support extending built-in types until 2.2, so at that time it was difficult to provide a basic iterable class (which is mentioned in another comment).
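
As a sketch of why putting join on the string works well with user-defined iterables: the class below is hypothetical, written for a current Python 3 interpreter rather than the 1.6-era one, and it does not inherit from list or tuple. It only implements __iter__, which is all str.join needs.

```python
class Countdown:
    """Hypothetical user-defined iterable that yields strings."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Yield start, start - 1, ..., 1 as strings.
        for number in range(self.start, 0, -1):
            yield str(number)


result = ', '.join(Countdown(3))
print(result)  # 3, 2, 1
```

If join lived on the sequence instead, a class like this would have to implement or inherit its own join method.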

Guido's decision is recorded in a historical mail, deciding on str.join(seq):

Funny, but it does seem right! Barry, go for it...
Guido van Rossum


Because the join() method is in the string class, instead of the list class?

I agree it looks funny.

See http://www.faqs.org/docs/diveintopython/odbchelper_join.html:

Historical note. When I first learned Python, I expected join to be a method of a list, which would take the delimiter as an argument. Lots of people feel the same way, and there’s a story behind the join method. Prior to Python 1.6, strings didn’t have all these useful methods. There was a separate string module which contained all the string functions; each function took a string as its first argument. The functions were deemed important enough to put onto the strings themselves, which made sense for functions like lower, upper, and split. But many hard-core Python programmers objected to the new join method, arguing that it should be a method of the list instead, or that it shouldn’t move at all but simply stay a part of the old string module (which still has lots of useful stuff in it). I use the new join method exclusively, but you will see code written either way, and if it really bothers you, you can use the old string.join function instead.

--- Mark Pilgrim, Dive into Python
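
One caveat for readers on Python 3: the quote above predates it, and the module-level string.join function no longer exists there. If the method-on-a-separator spelling really bothers you, a rough equivalent is to call the method through the str class. A short sketch, with illustrative variable names:

```python
import string

words = ['welcome', 'to', 'stack', 'overflow']

# The old module-level function is not available in Python 3.
has_old_function = hasattr(string, 'join')

# Calling the method through the class: the separator is the first argument.
unbound_result = str.join('_', words)

print(has_old_function)  # False
print(unbound_result)    # welcome_to_stack_overflow
```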