
How to do CamelCase split in Python


As @AplusKminus has explained, re.split() does not split on an empty pattern match (this holds for Python versions before 3.7; later versions do accept empty matches). Therefore, instead of splitting, you should try finding the components you are interested in.

Here is a solution using re.finditer() that emulates splitting:

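    from re import finditer

    def camel_case_split(identifier):
        # Lazily consume up to a lower|Upper boundary, an ACRONYM|Word boundary, or the end.
        matches = finditer('.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
        return [m.group(0) for m in matches]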
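For comparison, here is a small sketch that runs the finditer() approach next to a direct re.split() on the same boundaries. The second function, camel_case_split_py37, is a hypothetical name and assumes Python 3.7 or later, where re.split() accepts empty matches; on older versions only the finditer() version behaves as shown.

```python
import re

# The two zero-width boundaries used in the answer above:
# lower|Upper, and ACRONYM|Word (an uppercase letter followed by Upper+lower).
BOUNDARY = r'(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])'

def camel_case_split(identifier):
    # Same pattern as the finditer() solution, repeated so this block runs standalone.
    matches = re.finditer('.+?(?:' + BOUNDARY + '|$)', identifier)
    return [m.group(0) for m in matches]

def camel_case_split_py37(identifier):
    # Hypothetical variant: assumes Python 3.7+, where re.split() splits on empty matches.
    return re.split(BOUNDARY, identifier)

for name in ('CamelCaseXYZ', 'XYZCamelCase', 'IPAddress', 'camelCase'):
    print(name, camel_case_split(name), camel_case_split_py37(name))
```

The two functions agree on these inputs. They differ on the empty string: the finditer() version returns an empty list, while re.split() returns a list containing one empty string.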


Use re.sub() and split()

    import re
    name = 'CamelCaseTest123'
    splitted = re.sub('([A-Z][a-z]+)', r' \1', re.sub('([A-Z]+)', r' \1', name)).split()

Result

    'CamelCaseTest123' -> ['Camel', 'Case', 'Test123']
    'CamelCaseXYZ' -> ['Camel', 'Case', 'XYZ']
    'XYZCamelCase' -> ['XYZ', 'Camel', 'Case']
    'XYZ' -> ['XYZ']
    'IPAddress' -> ['IP', 'Address']
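To see why the two substitutions are needed, it can help to unroll the nested call into steps. The sketch below does that in a helper; the name split_with_sub is made up for illustration and is not part of the original answer.

```python
import re

def split_with_sub(name):
    # Hypothetical wrapper around the two re.sub() calls shown above.
    # Inner pass: insert a space before every run of capital letters.
    spaced = re.sub('([A-Z]+)', r' \1', name)
    # Outer pass: insert a space before each capitalized word, which detaches
    # a word from a preceding acronym ("XYZCamel" becomes "XYZ Camel").
    spaced = re.sub('([A-Z][a-z]+)', r' \1', spaced)
    # split() with no arguments collapses the repeated spaces.
    return spaced.split()

for name in ('CamelCaseTest123', 'CamelCaseXYZ', 'XYZCamelCase', 'XYZ', 'IPAddress'):
    print(repr(name), '->', split_with_sub(name))
```

Running it should reproduce the result list above.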


Most of the time, when you don't need to check the format of a string, a global search is simpler than a split (for the same result):

re.findall(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))', 'CamelCaseXYZ')

returns

['Camel', 'Case', 'XYZ']

To handle dromedaryCase (a lowercase first word) too, you can use:

re.findall(r'[A-Z]?[a-z]+|[A-Z]+(?=[A-Z]|$)', 'camelCaseXYZ')

Note: (?=[A-Z]|$) can be shortened using a double negation (a negative lookahead with a negated character class): (?![^A-Z])
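As a quick check of that note, the sketch below runs the dromedary pattern in both forms over a few identifiers and compares the results. The constant names and the sample list are assumptions chosen for illustration.

```python
import re

# Long form: lookahead for an uppercase letter or the end of the string.
WITH_ALTERNATION = r'[A-Z]?[a-z]+|[A-Z]+(?=[A-Z]|$)'
# Short form: "the next character is not a non-uppercase character".
WITH_DOUBLE_NEGATION = r'[A-Z]?[a-z]+|[A-Z]+(?![^A-Z])'

samples = ('camelCaseXYZ', 'CamelCaseXYZ', 'XYZCamelCase', 'IPAddress')
for name in samples:
    long_form = re.findall(WITH_ALTERNATION, name)
    short_form = re.findall(WITH_DOUBLE_NEGATION, name)
    # The last value printed is True when both forms give the same list.
    print(name, long_form, short_form == long_form)
```

Both forms agree on these samples. One caveat: these findall() patterns match letters only, so digits are dropped ('CamelCaseTest123' yields ['Camel', 'Case', 'Test']), unlike the finditer() and re.sub() approaches above.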