Fast way to remove lowercase substrings from a string?


Python 3.x answer:

You can make a string translation table. Once that translation table has been created, you can use it repeatedly:

>>> import string
>>> table = str.maketrans('', '', string.ascii_lowercase)
>>> s = 'FOObarFOOObBAR'
>>> s.translate(table)
'FOOFOOOBAR'

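When str.maketrans is called with three arguments, each character in the first argument maps to the character at the same position in the second argument (the two strings must be the same length). Any character not listed is left unchanged, so passing two empty strings gives an identity mapping. The third argument is the set of characters to delete.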
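To make the three-argument form concrete, here is a minimal sketch. The helper name remove_lowercase and the second table are illustrative assumptions, not part of the answer above; the table is built once at module level so it can be reused across calls.

```python
import string

# Built once and reused; deletes every ASCII lowercase letter.
DELETE_LOWER = str.maketrans('', '', string.ascii_lowercase)


def remove_lowercase(text):
    """Return text with ASCII lowercase letters removed (hypothetical helper)."""
    return text.translate(DELETE_LOWER)


# Mapping and deleting in one table: 'A' -> 'a', 'B' -> 'b',
# while 'x', 'y' and 'z' are deleted. 'C' is not listed, so it is unchanged.
map_and_delete = str.maketrans('AB', 'ab', 'xyz')
mapped = 'AxByCz'.translate(map_and_delete)

if __name__ == '__main__':
    print(remove_lowercase('FOObarFOOObBAR'))
    print(mapped)
```

Only characters in string.ascii_lowercase are removed here; non-ASCII lowercase letters pass through untouched.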


Old Python 2.x answer for anyone who cares:

I'd use str.translate. If you pass None as the translation table, only the delete step is performed. In this case, I pass string.ascii_lowercase as the characters to delete.

>>> import string
>>> s = 'FOObarFOOObBAR'
>>> s.translate(None, string.ascii_lowercase)
'FOOFOOOBAR'

I doubt you'll find a faster way, but there's always timeit to compare different options if someone is motivated :).


My first approach would be ''.join(x for x in s if not x.islower())
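As a sketch of that approach wrapped in a function (the name remove_lower_join is an assumption for illustration), it is worth noting one behavioural difference: str.islower() is Unicode-aware, so this version also drops non-ASCII lowercase letters, whereas a table built from string.ascii_lowercase leaves them in place.

```python
import string


def remove_lower_join(text):
    """Drop every character for which str.islower() is true (hypothetical helper)."""
    return ''.join(ch for ch in text if not ch.islower())


# Table that deletes only the 26 ASCII lowercase letters.
ascii_table = str.maketrans('', '', string.ascii_lowercase)

sample = 'CAFÉcafé'
join_result = remove_lower_join(sample)            # 'é' is removed as well
translate_result = sample.translate(ascii_table)   # 'é' survives

if __name__ == '__main__':
    print(join_result)
    print(translate_result)
```

Which behaviour is wanted depends on whether the input can contain non-ASCII text.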

If you need speed, use mgilson's answer; it is a lot faster (these timings are from Python 2):

>>> import timeit
>>> timeit.timeit("''.join(x for x in 'FOOBarBaz' if not x.islower())")
3.318969964981079
>>> timeit.timeit("'FOOBarBaz'.translate(None, string.ascii_lowercase)", "import string")
0.5369198322296143
>>> timeit.timeit("re.sub('[a-z]', '', 'FOOBarBaz')", "import re")
3.631659984588623
>>> timeit.timeit("r.sub('', 'FOOBarBaz')", "import re; r = re.compile('[a-z]')")
1.9642360210418701
>>> timeit.timeit("''.join(x for x in 'FOOBarBaz' if x not in lowercase)", "lowercase = set('abcdefghijklmnopqrstuvwxyz')")
2.9605889320373535
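For anyone who wants to repeat the comparison on Python 3, the following is a rough sketch of a harness covering the same five candidates. The names CANDIDATES and run_benchmark are assumptions made for this example, and the numbers it prints will vary by machine and Python version, so no expected timings are shown.

```python
import re
import string
import timeit

SAMPLE = 'FOOBarBaz'
TABLE = str.maketrans('', '', string.ascii_lowercase)
PATTERN = re.compile('[a-z]')
LOWERCASE = set(string.ascii_lowercase)

# Each candidate takes a string and returns it without ASCII lowercase letters.
CANDIDATES = {
    'join + islower': lambda s: ''.join(x for x in s if not x.islower()),
    'translate': lambda s: s.translate(TABLE),
    're.sub': lambda s: re.sub('[a-z]', '', s),
    'compiled regex': lambda s: PATTERN.sub('', s),
    'join + set': lambda s: ''.join(x for x in s if x not in LOWERCASE),
}


def run_benchmark(number=100000):
    """Time every candidate on SAMPLE; returns {name: seconds}."""
    return {
        name: timeit.timeit(lambda: func(SAMPLE), number=number)
        for name, func in CANDIDATES.items()
    }


if __name__ == '__main__':
    for name, seconds in sorted(run_benchmark().items(), key=lambda kv: kv[1]):
        print('{:<16} {:.4f}s'.format(name, seconds))
```

Each call goes through an extra lambda, which adds a small constant overhead to every candidate; that should not change the ordering much, but it does mean the absolute numbers are not directly comparable with the string-based timeit calls above.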


import re
remove_lower = lambda text: re.sub('[a-z]', '', text)
s = "FOObarFOOObBAR"
s = remove_lower(s)
print(s)