When should I ever use file.read() or file.readlines()?

python io timeit

The short answer to your question is that each of these three methods of reading a file has a different use case. As noted above, f.read() reads the file as a single string, and so allows relatively easy file-wide manipulations, such as a file-wide regex search or substitution.
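To make the file-wide case concrete, here is a minimal sketch of a regex substitution that spans line boundaries, which is awkward to do line by line. The helper name `replace_across_lines` and the sample file contents are made up for illustration.

```python
import re
import tempfile
from pathlib import Path

def replace_across_lines(path, pattern, replacement):
    """Read the whole file as one string and apply a regex substitution."""
    with open(path, "r") as f:
        text = f.read()  # entire file as a single string
    # re.DOTALL lets '.' match newlines, so one match may span several lines
    return re.sub(pattern, replacement, text, flags=re.DOTALL)

# Hypothetical sample file: a /* ... */ block that crosses a line break
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("keep /* drop\nthis */ keep\n")
    result = replace_across_lines(sample, r"/\*.*?\*/", "")

print(repr(result))
```

Because the match crosses a newline, neither readline() nor a for loop over the file would see the whole pattern in a single line.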

f.readline() reads a single line of the file, allowing the user to parse a single line without necessarily reading the entire file. It also makes it easier to apply custom logic while reading than a plain line-by-line iteration does, for example when a file changes format partway through.
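As a rough sketch of the "format changes partway through" case, the function below reads a header section with readline() and then switches to a different parsing rule for the rest. The file layout (key=value lines, a blank line, then comma-separated rows) and the name `parse_mixed_format` are assumptions for illustration only; an in-memory io.StringIO stands in for a real file.

```python
import io

def parse_mixed_format(f):
    """Parse a file whose format changes partway through.

    Assumed (hypothetical) layout: 'key=value' header lines, one blank
    line, then comma-separated data rows.
    """
    header = {}
    while True:
        line = f.readline()
        # '' means end of file; a whitespace-only line ends the header
        if line == "" or line.strip() == "":
            break
        key, _, value = line.strip().partition("=")
        header[key] = value

    rows = []
    while True:
        line = f.readline()
        if line == "":  # readline() returns '' only at end of file
            break
        rows.append(line.strip().split(","))
    return header, rows

sample = io.StringIO("name=demo\nversion=2\n\n1,2,3\n4,5,6\n")
header, rows = parse_mixed_format(sample)
print(header)
print(rows)
```

The sketch uses readline() throughout rather than mixing it with a for loop, which sidesteps the Python 2.7 issue mentioned in this thread.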

Using the syntax for line in f: allows the user to iterate over the file line by line as noted in the question.

(As noted in the other answer, this documentation is a very good read):

https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects

Note: It was previously claimed that f.readline() could be used to skip a line during a for loop iteration. However, this doesn't work in Python 2.7, and is perhaps a questionable practice, so this claim has been removed.

Hope this helps!

https://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects

When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory.
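If the file might not fit in memory, passing a size to read() is one way around that. The following is a minimal sketch, not a tuned implementation: the function name `count_bytes_in_chunks`, the chunk size, and the generated test file are all assumptions for illustration.

```python
import os
import tempfile

def count_bytes_in_chunks(path, chunk_size=64 * 1024):
    """Walk through a file in fixed-size chunks instead of one big read()."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # returns at most chunk_size bytes
            if not chunk:               # empty bytes object means end of file
                break
            total += len(chunk)
    return total

# Hypothetical sample file of 200000 bytes
with tempfile.TemporaryDirectory() as tmp:
    big = os.path.join(tmp, "big.bin")
    with open(big, "wb") as out:
        out.write(b"x" * 200000)
    total = count_bytes_in_chunks(big, chunk_size=4096)

print(total)
```

Only one chunk is held in memory at a time, so peak memory use stays near chunk_size regardless of the file's size.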

Sorry for all the edits!

For reading lines from a file, you can loop over the file object. This is memory efficient, fast, and leads to simple code:

    for line in f:
        print line,

    This is the first line of the file.
    Second line of the file

Note that readline() is not comparable to reading all lines in a for loop: it reads one line per call, and each call carries overhead, as others have already pointed out.

I ran timeit on two otherwise identical snippets, one using a for loop and the other using readlines(). You can see my snippet below:

    def test_read_file_1():
        f = open('ml/README.md', 'r')
        for line in f.readlines():
            print(line)

    def test_read_file_2():
        f = open('ml/README.md', 'r')
        for line in f:
            print(line)

    def test_time_read_file():
        from timeit import timeit

        duration_1 = timeit(lambda: test_read_file_1(), number=1000000)
        duration_2 = timeit(lambda: test_read_file_2(), number=1000000)

        print('duration using readlines():', duration_1)
        print('duration using for-loop:', duration_2)

And the results:

    duration using readlines(): 78.826229238
    duration using for-loop: 69.487692794

The bottom line, I would say, is that the for loop is faster, but when either would work, I'd rather use readlines().
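One situation where readlines() fits naturally is when you need the lines as a list, for example to index them, count them, or walk them in reverse. A small sketch, using an in-memory io.StringIO with made-up contents in place of a real file:

```python
import io

# io.StringIO stands in for an open text file (hypothetical contents)
f = io.StringIO("first\nsecond\nthird\n")

lines = f.readlines()  # list of lines, each keeping its trailing newline

line_count = len(lines)                  # a list has a length
last_line = lines[-1].rstrip("\n")       # and supports indexing
reversed_lines = [line.rstrip("\n") for line in reversed(lines)]

print(line_count)
print(last_line)
print(reversed_lines)
```

The trade-off is the same as with read(): the whole file is held in memory at once, so this suits files that are known to be small.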
