Finding patterns in list


The Code (updated for Python 2 + 3)

Ignoring the "no overlapping" requirement, here's the code I used:

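    import collections


    def pattern(seq):
        """Return the repeated sub-sequences (as tuples) of the greatest
        length, up to half of len(seq), at which a repeat exists."""
        storage = {}
        # Try every length from 1 up to half the list's length.
        for length in range(1, len(seq) // 2 + 1):
            # Map each start index to the sub-sequence of this length.
            valid_strings = {}
            for start in range(len(seq) - length + 1):
                valid_strings[start] = tuple(seq[start:start + length])
            candidates = set(valid_strings.values())
            if len(candidates) != len(valid_strings):
                # Fewer distinct values than start positions: something repeats.
                print("Pattern found for " + str(length))
                storage = valid_strings
            else:
                # No repeat at this length means none at any longer length.
                print("No pattern found for " + str(length))
                break
        # Keep only the sub-sequences that occur more than once.
        counts = collections.Counter(storage.values())
        return set(v for v, n in counts.items() if n > 1)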

Using that, I found 8 distinct patterns of length 303 in your dataset. The program ran pretty fast, too.
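The code above ignores the "no overlapping" requirement. Here is a minimal sketch of one way to check it for a single length, assuming "non-overlapping" means two occurrences whose start indices differ by at least the pattern length. The name `non_overlapping_repeats` is hypothetical and not part of the original answer.

```python
from collections import defaultdict


def non_overlapping_repeats(seq, length):
    """Return the sub-sequences of the given length (>= 1) that occur at
    least twice in seq without the two occurrences overlapping."""
    starts = defaultdict(list)
    for start in range(len(seq) - length + 1):
        starts[tuple(seq[start:start + length])].append(start)
    # Start indices are appended in increasing order, so the first and last
    # occurrences are the farthest apart; if even they overlap, every pair does.
    return {sub for sub, pos in starts.items() if pos[-1] - pos[0] >= length}


print(non_overlapping_repeats([1, 2, 1, 2, 1], 2))  # both (1, 2) and (2, 1) qualify
print(non_overlapping_repeats([1, 2, 1, 2, 1], 3))  # set(): (1, 2, 1) only repeats with overlap
```

Calling this for each candidate length instead of comparing `len(candidates)` with `len(valid_strings)` would be one way to fold the check into the loop, though I have not run that variant on the dataset from the question.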

Pseudocode Version

    define patterns(sequence):
        list_of_substrings = {}
        for each valid length:  ### i.e. lengths from 1 to half the list's length
            generate a dictionary my_dict of all sub-lists of size length
            if there are repeats:
                list_of_substrings = my_dict
            else:
                return all repeated values in list_of_substrings
        return list_of_substrings  #### returns {} when there are no patterns
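The early return in the pseudocode is safe because a repeated sub-list of length n+1 always contains a repeated sub-list of length n (its first n items), so once a length has no repeats, no longer length can have any. Here is a small brute-force check of that property; the helper `has_repeat` is a hypothetical name used only for this illustration.

```python
import itertools


def has_repeat(seq, length):
    """True if some sub-list of the given length occurs more than once."""
    subs = [tuple(seq[i:i + length]) for i in range(len(seq) - length + 1)]
    return len(set(subs)) != len(subs)


# Exhaustively check every sequence of length 1..6 over a 3-symbol alphabet:
# a repeat at length + 1 must imply a repeat at length.
for n in range(1, 7):
    for seq in itertools.product("abc", repeat=n):
        for length in range(1, n):
            if has_repeat(seq, length + 1):
                assert has_repeat(seq, length)
```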


I have an answer. It works (without overlapping), but it is for Python 3.

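    def get_pattern(seq):
        """Pick a frequently repeated run of items from seq (items must be strings)."""
        outs = {}  # joined sub-list -> number of occurrences
        l = 0      # length of the best key so far
        r = 0      # occurrence count of the best key so far
        c = None   # best key so far
        # Count every contiguous sub-list, joined with chr(1) as a separator.
        for end in range(len(seq) + 1):
            for start in range(end):
                word = chr(1).join(seq[start:end])
                if word not in outs:
                    outs[word] = 1
                else:
                    outs[word] += 1
        # Prefer a higher count, or a longer key that still occurs more than once.
        for item in outs:
            if outs[item] > r or (len(item) > l and outs[item] > 1):
                l = len(item)
                r = outs[item]
                c = item
        if c is None:  # empty input: nothing to report
            return []
        return c.split(chr(1))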
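The answer above builds its dictionary keys by joining items with `chr(1)`, which assumes every item is a string that does not itself contain `chr(1)`. Here is a sketch of the same counting step using tuples as keys, which should work for any hashable items; `count_sublists` is a hypothetical name, not something from the original answer.

```python
from collections import Counter


def count_sublists(seq):
    """Count every contiguous, non-empty sub-list of seq, keyed by tuple."""
    counts = Counter()
    for end in range(len(seq) + 1):
        for start in range(end):
            counts[tuple(seq[start:end])] += 1
    return counts


counts = count_sublists([1, 2, 1, 2])
print(counts[(1, 2)])  # 2
```

Like the original, this visits every (start, end) pair, so the number of sub-lists grows quadratically with the length of the list.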