
Splitting list based on missing numbers in a sequence


Python 3 version of the code from the old Python documentation:

>>> # Find runs of consecutive numbers using groupby.  The key to the solution
>>> # is differencing with a range so that consecutive numbers all appear in
>>> # same group.
>>> from itertools import groupby
>>> from operator import itemgetter
>>> data = [ 1,  4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
>>> for k, g in groupby(enumerate(data), lambda i_x: i_x[0] - i_x[1]):
...     print(list(map(itemgetter(1), g)))
...
[1]
[4, 5, 6]
[10]
[15, 16, 17, 18]
[22]
[25, 26, 27, 28]

The groupby function from the itertools module starts a new group every time the key function's return value changes. The trick is that the key is the position of the element in the list minus the number itself. Within a run of consecutive numbers both increase by one at each step, so the difference stays constant; it changes only when there is a gap in the numbers.
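To see why this works, it can help to print the key next to each element. A minimal sketch, reusing the `data` list from the example above (the `keys` name is only for illustration):

```python
data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]

# Key used by groupby in the example: position minus value.
# Inside a run of consecutive numbers both grow by 1 per step,
# so the key stays the same until a gap appears.
keys = [i - x for i, x in enumerate(data)]

for x, k in zip(data, keys):
    print(x, k)
```

Elements that share a key (for example 4, 5 and 6 all give -3) end up in the same group.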

The itemgetter function comes from the operator module, so you'll have to import it along with itertools for this example to work.

Alternatively, as a list comprehension:

>>> seq2 = [1, 2, 4, 5, 6, 8, 9, 10]  # sample input, inferred from the output below
>>> [list(map(itemgetter(1), g)) for k, g in groupby(enumerate(seq2), lambda i_x: i_x[0] - i_x[1])]
[[1, 2], [4, 5, 6], [8, 9, 10]]


This is a solution that works in Python 3 (based on previous answers that work in Python 2 only).

>>> from operator import itemgetter
>>> from itertools import groupby
>>> seq2 = [1, 2, 4, 5, 6, 8, 9, 10]  # sample input, inferred from the output below
>>> groups = []
>>> for k, g in groupby(enumerate(seq2), lambda x: x[0]-x[1]):
...     groups.append(list(map(itemgetter(1), g)))
...
>>> print(groups)
[[1, 2], [4, 5, 6], [8, 9, 10]]

Or as a list comprehension:

>>> [list(map(itemgetter(1), g)) for k, g in groupby(enumerate(seq2), lambda x: x[0]-x[1])]
[[1, 2], [4, 5, 6], [8, 9, 10]]

Changes were needed because of:

  • Removal of tuple parameter unpacking (PEP 3113)
  • map returning an iterator instead of a list
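Both changes can be checked directly in a Python 3 interpreter. A small sketch (the `pairs` list and the variable names are made up for illustration):

```python
from operator import itemgetter

# PEP 3113: a lambda can no longer unpack a tuple in its parameter list,
# so the Python 2 style key function fails to compile.
try:
    compile("lambda (i, x): i - x", "<example>", "eval")
    old_syntax_ok = True
except SyntaxError:
    old_syntax_ok = False
print(old_syntax_ok)

# Replacement: take one argument and index into it.
key = lambda pair: pair[0] - pair[1]
print(key((3, 5)))

# map() returns a lazy iterator, so wrap it in list() to get a list.
pairs = [(0, 1), (1, 2)]
values = map(itemgetter(1), pairs)
is_list = isinstance(values, list)
as_list = list(values)
print(is_list, as_list)
```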


Another option which doesn't need itertools etc.:

>>> data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
>>> spl = [0]+[i for i in range(1,len(data)) if data[i]-data[i-1]>1]+[None]
>>> [data[b:e] for (b, e) in [(spl[i-1],spl[i]) for i in range(1,len(spl))]]
[[1], [4, 5, 6], [10], [15, 16, 17, 18], [22], [25, 26, 27, 28]]
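The same idea can also be written as a plain loop, which may be easier to follow than the nested comprehensions. This is only a sketch and not part of the answer above: the function name `split_on_gaps` is made up here, and it assumes the input is a sorted list of integers.

```python
def split_on_gaps(data):
    """Split a sorted list of integers into runs with no gap larger than 1."""
    groups = []
    for x in data:
        # Start a new run for the first element or when the step
        # from the previous element is larger than 1.
        if not groups or x - groups[-1][-1] > 1:
            groups.append([x])
        else:
            groups[-1].append(x)
    return groups


result = split_on_gaps([1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28])
print(result)
```

One difference from the slicing version: for an empty input this sketch returns an empty list, while the slicing version returns a list containing one empty list.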