Best way to determine if a sequence is in another sequence? Best way to determine if a sequence is in another sequence? python python

Best way to determine if a sequence is in another sequence?


I second the Knuth-Morris-Pratt algorithm. By the way, your problem (and the KMP solution) is exactly recipe 5.13 in Python Cookbook 2nd edition. You can find the related code at http://code.activestate.com/recipes/117214/

It finds all the correct subsequences in a given sequence, and should be used as an iterator:

>>> for s in KnuthMorrisPratt([4,'a',3,5,6], [5,6]): print s3>>> for s in KnuthMorrisPratt([4,'a',3,5,6], [5,7]): print s(nothing)


Here's a brute-force approach O(n*m) (similar to @mcella's answer). It might be faster than the Knuth-Morris-Pratt algorithm implementation in pure Python O(n+m) (see @Gregg Lind answer) for small input sequences.

#!/usr/bin/env pythondef index(subseq, seq):    """Return an index of `subseq`uence in the `seq`uence.    Or `-1` if `subseq` is not a subsequence of the `seq`.    The time complexity of the algorithm is O(n*m), where        n, m = len(seq), len(subseq)    >>> index([1,2], range(5))    1    >>> index(range(1, 6), range(5))    -1    >>> index(range(5), range(5))    0    >>> index([1,2], [0, 1, 0, 1, 2])    3    """    i, n, m = -1, len(seq), len(subseq)    try:        while True:            i = seq.index(subseq[0], i + 1, n - m + 1)            if subseq == seq[i:i + m]:               return i    except ValueError:        return -1if __name__ == '__main__':    import doctest; doctest.testmod()

I wonder how large is the small in this case?


A simple approach: Convert to strings and rely on string matching.

Example using lists of strings:

 >>> f = ["foo", "bar", "baz"] >>> g = ["foo", "bar"] >>> ff = str(f).strip("[]") >>> gg = str(g).strip("[]") >>> gg in ff True

Example using tuples of strings:

>>> x = ("foo", "bar", "baz")>>> y = ("bar", "baz")>>> xx = str(x).strip("()")>>> yy = str(y).strip("()")>>> yy in xxTrue

Example using lists of numbers:

>>> f = [1 , 2, 3, 4, 5, 6, 7]>>> g = [4, 5, 6]>>> ff = str(f).strip("[]")>>> gg = str(g).strip("[]")>>> gg in ffTrue