Is there any nicer way to write successive "or" statements in Python?
A way could be
if any(s in mystring for s in ('foo', 'bar', 'hello')): pass
The thing you iterate over is a tuple, which is built upon compilation of the function, so it shouldn't be inferior to your original version.
If you fear that the tuple will become too long, you could do
def mystringlist(): yield 'foo' yield 'bar' yield 'hello'if any(s in mystring for s in mystringlist()): pass
This sounds like a job for a regex.
import reif re.search("(foo|bar|hello)", mystring): # Do something pass
It should be faster, too. Especially if you compile the regex ahead of time.
If you're generating the regular expression automatically, you could use re.escape()
to make sure no special characters break your regex. For example, if words
is a list of strings you wish to search for, you could generate your pattern like this:
pattern = "(%s)" % ("|".join(re.escape(word) for word in words), )
You should also note that if you have m
words and your string has n
characters, your original code has O(n*m)
complexity, while the regular expression has O(n)
complexity. Even though Python regexs are not really theoretical comp-sci regular expressions, and are not always O(n)
complexity, in this simple case they are.
Since you are processing word-by-word against mystring
, surely mystring can be used as a set. Then just take the intersection between the set containing the words in mystring
and the target groups of words:
In [370]: mystring=set(['foobar','barfoo','foo'])In [371]: mystring.intersection(set(['foo', 'bar', 'hello']))Out[371]: set(['foo'])
Your logical 'or' is the members of the intersection of the two sets.
Using a set is also faster. Here are relative timing vs a generator and regular expression:
f1: generator to test against large string f2: re to test against large string f3: set intersection of two sets of words rate/sec f2 f1 f3f2 101,333 -- -95.0% -95.5%f1 2,026,329 1899.7% -- -10.1%f3 2,253,539 2123.9% 11.2% --
So a generator and the in
operation is 19x faster than a regular expression and a set intersection is 21x faster than a regex and 11% faster than a generator.
Here is the code that generated the timing:
import rewith open('/usr/share/dict/words','r') as fin: set_words={word.strip() for word in fin}s_words=' '.join(set_words)target=set(['bar','foo','hello'])target_re = re.compile("(%s)" % ("|".join(re.escape(word) for word in target), ))gen_target=(word for word in ('bar','foo','hello'))def f1(): """ generator to test against large string """ if any(s in s_words for s in gen_target): return Truedef f2(): """ re to test against large string """ if re.search(target_re, s_words): return Truedef f3(): """ set intersection of two sets of words """ if target.intersection(set_words): return Truefuncs=[f1,f2,f3]legend(funcs)cmpthese(funcs)