
Reading binary file and looping over each byte


Python 2.4 and Earlier

f = open("myfile", "rb")
try:
    byte = f.read(1)
    while byte != "":
        # Do stuff with byte.
        byte = f.read(1)
finally:
    f.close()

Python 2.5-2.7

with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte != "":
        # Do stuff with byte.
        byte = f.read(1)

Note that the with statement is not available in versions of Python below 2.5. To use it in Python 2.5 you need to enable it with a __future__ import:

from __future__ import with_statement

In Python 2.6 and later this import is not needed.

Python 3

In Python 3, it's a bit different. A stream opened in binary mode no longer returns str characters but bytes objects, so the end-of-file condition has to compare against b"" instead of "":

with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte != b"":
        # Do stuff with byte.
        byte = f.read(1)

Or, as benhoyt says, skip the inequality test and take advantage of the fact that b"" evaluates to false. This makes the code compatible between 2.6 and 3.x without any changes. It also saves you from changing the condition if you switch from binary mode to text mode or the reverse.

with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte:
        # Do stuff with byte.
        byte = f.read(1)
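The truthiness-based loop can be checked without a real file. The following is a minimal sketch for Python 3, assuming io.BytesIO as a stand-in for a file opened with "rb"; the helper name read_single_bytes is hypothetical and not part of the answer above. It also shows that a zero byte does not end the loop early, because b"\x00" is a non-empty bytes object and therefore truthy.

```python
import io


def read_single_bytes(stream):
    """Collect every 1-byte read from a binary stream until EOF.

    Hypothetical helper for illustration only.
    """
    collected = []
    byte = stream.read(1)
    while byte:  # b"" (EOF) is falsy; any 1-byte value, even b"\x00", is truthy
        collected.append(byte)
        byte = stream.read(1)
    return collected


# io.BytesIO stands in for a file opened with "rb" (an assumption of this sketch).
sample = io.BytesIO(b"\x00\x01A")
print(read_single_bytes(sample))
```

Each element is a bytes object of length 1, not an int; index it (byte[0]) if the numeric value is needed.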

Python 3.8

From Python 3.8 onward, the := (assignment expression) operator lets you fold the read into the loop condition, so the above code can be written more compactly.

with open("myfile", "rb") as f:
    while (byte := f.read(1)):
        # Do stuff with byte.
        pass
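As a concrete use of the := form, here is a small sketch (Python 3.8+ only) that counts how often each byte value occurs. The function name byte_histogram and the use of io.BytesIO in place of a real file are assumptions made for illustration.

```python
import io
from collections import Counter


def byte_histogram(stream):
    """Count occurrences of each byte value in a binary stream.

    Hypothetical helper; requires Python 3.8+ for the := operator.
    """
    counts = Counter()
    while (byte := stream.read(1)):
        # byte is a length-1 bytes object; byte[0] is its integer value.
        counts[byte[0]] += 1
    return counts


histogram = byte_histogram(io.BytesIO(b"aab"))
print(histogram[ord("a")], histogram[ord("b")])
```

Reading one byte per call is simple but slow on large files; the chunked approaches later in this thread avoid most of that overhead.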


This generator yields bytes from a file, reading the file in chunks:

def bytes_from_file(filename, chunksize=8192):
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(chunksize)
            if chunk:
                for b in chunk:
                    yield b
            else:
                break

# example:
for b in bytes_from_file('filename'):
    do_stuff_with(b)

See the Python documentation for information on iterators and generators.
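To see what the generator actually yields, the sketch below runs it against a small temporary file. The demo function and the temporary file are assumptions added for illustration; the generator itself follows the answer above. Note that in Python 3, iterating over a bytes chunk yields ints, while in Python 2 it yields 1-character strings.

```python
import os
import tempfile


def bytes_from_file(filename, chunksize=8192):
    """Yield the bytes of a file one at a time, reading in chunks."""
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(chunksize)
            if not chunk:
                break
            for b in chunk:
                yield b


def demo():
    """Write three bytes to a temporary file and read them back (illustrative)."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x01\x02\x03")
        path = tmp.name
    try:
        # chunksize=2 forces two reads, so the chunk boundary is exercised.
        return list(bytes_from_file(path, chunksize=2))
    finally:
        os.remove(path)


print(demo())
```

Under Python 3 this prints a list of ints, so code that expects bytes objects needs to convert each value, for example with bytes([b]).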


If the file is small enough that holding it in memory is not a problem:

with open("filename", "rb") as f:
    bytes_read = f.read()
for b in bytes_read:
    process_byte(b)

where process_byte represents some operation you want to perform on the passed-in byte.

If you want to process a chunk at a time:

with open("filename", "rb") as f:
    bytes_read = f.read(CHUNKSIZE)
    while bytes_read:
        for b in bytes_read:
            process_byte(b)
        bytes_read = f.read(CHUNKSIZE)

The with statement is available in Python 2.5 and greater.
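As one possible instance of the chunk-at-a-time pattern, the sketch below computes a simple additive checksum under Python 3. The value of CHUNKSIZE, the checksum function, and the use of io.BytesIO instead of a real file are all assumptions; the answer leaves CHUNKSIZE and process_byte undefined.

```python
import io

CHUNKSIZE = 4096  # assumed value; the answer does not define CHUNKSIZE


def checksum(stream, chunksize=CHUNKSIZE):
    """Sum all byte values modulo 256, reading the stream chunk by chunk.

    Hypothetical stand-in for process_byte; Python 3 semantics assumed.
    """
    total = 0
    bytes_read = stream.read(chunksize)
    while bytes_read:
        for b in bytes_read:  # in Python 3, b is an int in range(256)
            total = (total + b) % 256
        bytes_read = stream.read(chunksize)
    return total


# A tiny chunksize exercises the multi-chunk path on a short input.
print(checksum(io.BytesIO(b"\x01\x02\x03"), chunksize=2))
```

Only one chunk is held in memory at a time, so the same loop works for files far larger than available RAM.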