How to extract information between two unique words in a large text file

You can use regular expressions for that.

>>> st = "alpha here is my text bravo"
>>> import re
>>> re.findall(r'alpha(.*?)bravo',st)
[' here is my text ']

My test.txt file

alpha here is my line
yipee
bravo

Now use open to read the file and then apply the regular expression.

>>> f = open('test.txt','r')
>>> data = f.read()
>>> x = re.findall(r'alpha(.*?)bravo',data,re.DOTALL)
>>> x
[' here is my line\nyipee\n']
>>> "".join(x).replace('\n',' ')
' here is my line yipee '
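The session above can be wrapped into reusable functions. This is only a sketch: the names extract_between and extract_from_file are made up for illustration, and it assumes the file fits in memory. re.escape is added so that markers containing regex metacharacters are matched literally, and the with block closes the file automatically.

```python
import re

def extract_between(text, start, end):
    """Return every substring found between the start and end markers."""
    # Escape the markers so characters like '.' or '*' are matched literally.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    # re.DOTALL lets '.' match newlines, so matches may span several lines.
    return re.findall(pattern, text, re.DOTALL)

def extract_from_file(path, start, end):
    """Read the whole file at path and extract the text between the markers."""
    with open(path, "r") as f:
        return extract_between(f.read(), start, end)

sample = "alpha here is my line\nyipee\nbravo"
print(extract_between(sample, "alpha", "bravo"))
```

Calling extract_from_file('test.txt', 'alpha', 'bravo') on the file shown earlier should give the same list as the interactive session.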

python parsing search text batch-file

a = 'alpha'
b = 'bravo'
text = 'from alpha all the way to bravo and beyond.'
text.split(a)[-1].split(b)[0]
# ' all the way to '
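One thing to keep in mind with the split approach is that it silently returns text even when a marker is missing. A possible variant, sketched here with str.partition and a hypothetical helper name between, returns None in that case. It cuts at the first occurrence of the start marker, which is the same place as split(a)[-1] when the word really is unique.

```python
def between(text, start, end):
    """Return the text between start and end, or None if either is missing."""
    # partition returns (before, separator, after); separator is '' if not found.
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end)
    return middle if found_end else None

text = 'from alpha all the way to bravo and beyond.'
print(repr(between(text, 'alpha', 'bravo')))
```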


str.find and its sibling rfind have start and end args.

alpha = 'qawsed'
bravo = 'azsxdc'
startpos = text.find(alpha) + len(alpha)
endpos = text.find(bravo, startpos)
do_something_with(text[startpos:endpos])

This is the fastest way if the contained text is short and near the front.

If the contained text is relatively large, use:

startpos = text.find(alpha) + len(alpha)
endpos = text.rfind(bravo)

If the contained text is short and near the end, use:

endpos = text.rfind(bravo)
# alpha must end at or before endpos, so search within text[0:endpos]
startpos = text.rfind(alpha, 0, endpos) + len(alpha)

The first method is in any case better than the naive method of starting the second search from the start of the text; use it if your contained text has no dominant pattern.
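The snippets above assume both markers are present. Since str.find returns -1 when nothing is found, a small wrapper around the first method might look like the sketch below. The function name slice_between and the sample text are hypothetical, chosen only for illustration.

```python
def slice_between(text, alpha, bravo):
    """Return the text between alpha and bravo, or None if either is absent."""
    start = text.find(alpha)
    if start == -1:
        return None
    startpos = start + len(alpha)
    # Begin the second search where the first marker ends, not at index 0.
    endpos = text.find(bravo, startpos)
    if endpos == -1:
        return None
    return text[startpos:endpos]

text = 'xx qawsed payload azsxdc yy'
print(repr(slice_between(text, 'qawsed', 'azsxdc')))
```

Without the -1 checks, a missing first marker would make startpos equal to len(alpha) - 1 and the slice would quietly return unrelated text.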
