Python split() without removing the delimiter [duplicate]


    d = ">"
    for line in all_lines:
        s = [e + d for e in line.split(d) if e]
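One thing to watch with the snippet above: it appends the delimiter to every non-empty piece, including trailing text that was never followed by one. A minimal sketch of a variant that only re-attaches delimiters that were really there (the helper name `split_keep_delimiter` and the sample lines are made up for illustration; unlike the original, it does not drop empty pieces between consecutive delimiters):

```python
def split_keep_delimiter(line, delimiter=">"):
    """Split after each delimiter and keep the delimiter on each piece."""
    pieces = line.split(delimiter)
    # Every piece except the last was followed by a delimiter in the input.
    result = [piece + delimiter for piece in pieces[:-1]]
    # The last piece is whatever trailed the final delimiter; keep it if non-empty.
    if pieces[-1]:
        result.append(pieces[-1])
    return result


# Hypothetical sample input standing in for all_lines.
all_lines = ["<html><head>", "<body>text"]
for line in all_lines:
    print(split_keep_delimiter(line))
```

The second line comes back as `['<body>', 'text']`, whereas the one-liner would give `['<body>', 'text>']`.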


If you are parsing HTML with splits, you are most likely doing it wrong, unless you are writing a one-off script for a fixed, trusted input file. If it is supposed to work on arbitrary HTML input, how will you handle something like <a title='growth > 8%' href='#something'>?
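For input you don't control, a real parser copes with that case. A rough sketch using the standard library's `html.parser` (the `TagCollector` class name and the sample string are just for illustration, and this only collects start tags):

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the raw text of every start tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as written in the input.
        self.tags.append(self.get_starttag_text())


collector = TagCollector()
collector.feed("<body><a title='growth > 8%' href='#something'>")
collector.close()
print(collector.tags)

# For comparison, a plain split cuts the <a> tag in two at the quoted '>'.
naive = "<a title='growth > 8%' href='#something'>".split(">")
print(naive)
```

The parser keeps the `<a ...>` tag in one piece because it understands quoted attribute values; the split has no way to know that the first `>` is inside a string.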

Anyway, the following works for me:

    >>> import re
    >>> re.split('(<[^>]*>)', '<body><table><tr><td>')[1::2]
    ['<body>', '<table>', '<tr>', '<td>']
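In case the `[1::2]` slice looks odd: because the pattern is wrapped in a capturing group, `re.split()` returns the matched tags interleaved with the text between them, so the tags sit at the odd indices. A small illustration with a made-up input that has text between the tags:

```python
import re

# Hypothetical sample input with text between the tags.
sample = "<p>Hello</p>"

# The capturing group makes re.split() keep the matched tags in the result.
parts = re.split(r"(<[^>]*>)", sample)
print(parts)        # ['', '<p>', 'Hello', '</p>', '']

tags = parts[1::2]  # odd indices: the captured tags
text = parts[0::2]  # even indices: whatever sat between the tags
print(tags)         # ['<p>', '</p>']
print(text)         # ['', 'Hello', '']
```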


How about this:

    import re
    s = '<html><head>'
    re.findall('[^>]+>', s)
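A quick check of what that pattern returns on a few inputs (the extra sample strings are made up). Note that text before a tag is glued onto that tag's piece, and trailing text with no closing `>` is dropped:

```python
import re

pattern = r"[^>]+>"

only_tags = re.findall(pattern, "<html><head>")
with_text = re.findall(pattern, "<p>Hello</p>")
with_tail = re.findall(pattern, "<a>tail")

print(only_tags)  # ['<html>', '<head>']
print(with_text)  # ['<p>', 'Hello</p>']
print(with_tail)  # ['<a>']
```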