How to read specific lines from a file (by line number)?


If the file to read is big, and you don't want to read the whole file in memory at once:

    fp = open("file")
    for i, line in enumerate(fp):
        if i == 25:
            print(line)  # 26th line
        elif i == 29:
            print(line)  # 30th line
        elif i > 29:
            break  # past the last wanted line, stop reading
    fp.close()

Note that i == n-1 for the nth line.


In Python 2.6 or later:

    with open("file") as fp:
        for i, line in enumerate(fp):
            if i == 25:
                print(line)  # 26th line
            elif i == 29:
                print(line)  # 30th line
            elif i > 29:
                break  # past the last wanted line, stop reading
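
The same loop can be wrapped in a small reusable function. This is only a sketch: the name `read_lines_at` and its dictionary return value are assumptions made for illustration, not part of the answer above. It takes zero-based indices, like the loop, and stops reading once the largest wanted index has been passed.

```python
def read_lines_at(path, wanted):
    """Return {index: line} for the zero-based line indices in `wanted`.

    Hypothetical helper: reads the file sequentially and keeps only
    the requested lines, so memory use stays small for big files.
    """
    wanted = set(wanted)
    if not wanted:
        return {}
    last = max(wanted)
    found = {}
    with open(path) as fp:
        for i, line in enumerate(fp):
            if i in wanted:
                found[i] = line
            if i >= last:
                break  # nothing more to collect, stop reading
    return found
```

Indices beyond the end of the file are simply absent from the result, so `read_lines_at("file", [25, 29])` on a 10-line file returns an empty dictionary rather than raising.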


The quick answer:

    with open('filename') as f:
        lines = f.readlines()
    print(lines[25])  # 26th line
    print(lines[29])  # 30th line

or:

    lines = [25, 29]
    i = 0
    f = open('filename')
    for line in f:
        if i in lines:
            print(line)  # print the line itself, not its index
        i += 1
    f.close()

There is a more elegant solution for extracting many lines: linecache (courtesy of "python: how to jump to a particular line in a huge text file?", a previous stackoverflow.com question).

Quoting the Python documentation linked above:

    >>> import linecache
    >>> linecache.getline('/etc/passwd', 4)
    'sys:x:3:3:sys:/dev:/bin/sh\n'

Change the 4 to your desired line number, and you're set. Note that linecache counts lines from 1, so 4 returns the fourth line, unlike the zero-based enumerate() index used above.
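
To see how `linecache.getline` numbers lines, here is a small self-contained check. The temporary file and the variable names are made up for the example. It relies on the documented behaviour that line numbers start at 1 and that an out-of-range request returns an empty string instead of raising.

```python
import linecache
import os
import tempfile

# Build a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

first = linecache.getline(sample_path, 1)     # line numbers start at 1
third = linecache.getline(sample_path, 3)
missing = linecache.getline(sample_path, 99)  # out of range: empty string, no exception

linecache.clearcache()  # drop the cached file contents when done
os.remove(sample_path)
```

Because a missing line comes back as `''` rather than an error, it is worth checking the result before using it if the line number might exceed the file length.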

linecache reads and caches the whole file in memory. If the file might be very large and cause problems when read into memory, it might be a good idea to take @Alok's advice and use enumerate().

To Conclude:

  • Use fileobject.readlines() or for line in fileobject as a quick solution for small files.
  • Use linecache for a more elegant solution, which will be quite fast for reading many files, possibly repeatedly.
  • Take @Alok's advice and use enumerate() for files which could be very large and won't fit into memory. Note that this method might be slow because the file is read sequentially.


A fast and compact approach could be:

    def picklines(thefile, whatlines):
        return [x for i, x in enumerate(thefile) if i in whatlines]

This accepts any open file-like object thefile (leaving it up to the caller whether it should be opened from a disk file, or via e.g. a socket or other file-like stream) and a set of zero-based line indices whatlines, and returns a list, with low memory footprint and reasonable speed. If the number of lines to be returned is huge, you might prefer a generator:

    def yieldlines(thefile, whatlines):
        return (x for i, x in enumerate(thefile) if i in whatlines)

which is basically only good for looping upon -- note that the only difference comes from using round rather than square brackets in the return statement, making a generator expression and a list comprehension respectively.

Further note that despite the mention of "lines" and "file" these functions are much, much more general -- they'll work on any iterable, be it an open file or any other, returning a list (or generator) of items based on their progressive item-numbers. So, I'd suggest using more appropriately general names;-).
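
As a sketch of what such a general version might look like (the names `pick_items`, `iterable` and `indices` are only suggestions, not from the answer above), the indices are also copied into a set so that the membership test stays cheap even if the caller passes a list:

```python
def pick_items(iterable, indices):
    """Lazily yield the items whose zero-based position is in `indices`."""
    wanted = set(indices)  # set membership is O(1) even if a list was passed
    return (item for position, item in enumerate(iterable) if position in wanted)
```

It works the same way on an open file, a string, or a range: `list(pick_items("abcdefg", [1, 3]))` gives `['b', 'd']`.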