Efficiently insert multiple elements in a list (or another data structure) keeping their order Efficiently insert multiple elements in a list (or another data structure) keeping their order python-3.x python-3.x

Efficiently insert multiple elements in a list (or another data structure) keeping their order


Here's python code for a treap with a size decoration that allows insertion at specific indexes, and reordering of whole contiguous sections. It was adapted from C++ code, Kimiyuki Onaka's solution to the Hackerrank problem, "Give Me the Order." (I cannot guarantee that this adaptation is bug free -- a copy of the original code is available in the description of this question.)

import randomclass Treap:  def __init__(self, value=None):    self.value = value    self.key = random.random()    self.size = 1    self.left = None    self.right = Nonedef size(t):  return t.size if t else 0def update(t):  if t:    t.size = 1 + size(t.left) + size(t.right)  return tdef merge(a, b):  if not a:    return b  if not b:    return a  if a.key > b.key:    a.right = merge(a.right, b)    return update(a)  else:    b.left = merge(a, b.left)    return update(b)def split(t, i):  if not t:    return None, None  if i <= size(t.left):    u, t.left = split(t.left, i)    return u, update(t)  else:    t.right, u = split(t.right, i - size(t.left) - 1)    return update(t), udef insert(t, i, value):  left, right = split(t, i)  u = Treap(value)  return merge(merge(left, u), right)def inorder(treap):  if not treap:    return  if treap.left:    inorder(treap.left)  print(treap.value)  if treap.right:    inorder(treap.right)

Output:

lst = ['itemX', 'itemY', 'itemZ']idxs = [0, 0, 1]t = Nonefor i in range(len(lst)):  t = insert(t, idxs[i], lst[i])inorder(t)"""itemYitemZitemX"""


You could use SortedList, neutralize its sorting with a constant key function, and only use it for its fast insertions. Needs version 1.5.10 or older, as insert got removed.

def insertions(indexes, items):    tmp = SortedList(key=lambda _: 0)    for index, item in zip(indexes, items):        tmp.insert(index, item)    return list(tmp)

(I imagine there's also something like it but without sorting that needs to be neutralized, sortedcontainers is just something I know.)

Benchmark results:

               indexes =   [0] * 10**6     [randint(0, i) for i in range(10**6)]--------------------------------------------------------------------------------original                  1540 seconds     759 secondsneutralized SortedList      13 seconds      31 secondssorted mediants            201 seconds     249 secondssorted mediants optimized   42 seconds      72 seconds

Those last two solutions are another idea:

Use a SortedList the normal way, but annotate each item with a fraction from 0 to 1 (and order by that). To insert between two items, use those items' mediant.

from sortedcontainers import SortedListfrom fractions import Fractiondef insertions(indexes, items):    xs = SortedList([(Fraction(0), None), (Fraction(1), None)])    for index, item in zip(indexes, items):        a, c = xs[index][0].as_integer_ratio()        b, d = xs[index + 1][0].as_integer_ratio()        xs.add((Fraction(a+b, c+d), item))    return [item for _, item in xs[1:-1]]

Optimized version doing fractions myself:

from sortedcontainers import SortedListclass X(tuple):    def __lt__(self, other):        return self[0] * other[1] < self[1] * other[0]def insertions(indexes, items):    xs = SortedList([X((0, 1, None)), X((1, 1, None))])    for index, item in zip(indexes, items):        L, R = xs[index : index+2]        xs.add(X((L[0] + R[0], L[1] + R[1], item)))    return [x[2] for x in xs[1:-1]]