You need to read the file in chunks of suitable size:

    import hashlib

    def md5_for_file(f, block_size=2**20):
        # f must be a file object opened in binary mode
        md5 = hashlib.md5()
        while True:
            data = f.read(block_size)
            if not data:
                break
            md5.update(data)
        return md5.digest()

NOTE: Make sure you open the file in binary mode by passing 'rb' to open(), otherwise you will get the wrong result.
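
To see why binary mode matters, here is a minimal sketch. The temporary file and the helper names `md5_binary` and `md5_text` are made up for illustration. In text mode Python translates `\r\n` to `\n` while reading, so the bytes that reach MD5 are no longer the bytes on disk:

```python
import hashlib
import os
import tempfile


def md5_binary(path):
    # Hash the exact bytes stored on disk.
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def md5_text(path):
    # Hypothetical "wrong" variant: text mode with universal newlines
    # rewrites "\r\n" as "\n" before the data is hashed.
    with open(path, "r", encoding="latin-1") as f:
        return hashlib.md5(f.read().encode("latin-1")).hexdigest()


# Write a small file with Windows-style line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"line one\r\nline two\r\n")

binary_digest = md5_binary(path)
text_digest = md5_text(path)
os.remove(path)

print(binary_digest == text_digest)  # False: newline translation changed the input
```

The file here is tiny, so reading it in one go is fine; the point is only the difference between the two open modes.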

So to do the whole lot in one function, use something like:

    import hashlib
    import os

    def generate_file_md5(rootdir, filename, blocksize=2**20):
        m = hashlib.md5()
        with open(os.path.join(rootdir, filename), "rb") as f:
            while True:
                buf = f.read(blocksize)
                if not buf:
                    break
                m.update(buf)
        return m.hexdigest()

The update above is based on the comments provided by Frerich Raabe. I tested it and found it to be correct on my Python 2.7.2 Windows installation.

I cross-checked the results using the 'jacksum' tool.

jacksum -a md5 <filename>
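
If jacksum is not at hand, a similar sanity check can be sketched in Python itself: hash the same data once in chunks and once in a single call, and compare. The function name `md5_chunked` and the random test payload are assumptions made for this example only.

```python
import hashlib
import os
import tempfile


def md5_chunked(path, block_size=2**20):
    # Feed the file to MD5 one block at a time.
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            data = f.read(block_size)
            if not data:
                break
            md5.update(data)
    return md5.hexdigest()


# Hypothetical test data: 3 MiB plus a few bytes, so the last block is partial.
payload = os.urandom(3 * 2**20 + 17)
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(payload)

chunked = md5_chunked(path)
one_shot = hashlib.md5(payload).hexdigest()
os.remove(path)

print(chunked == one_shot)  # True: chunking does not change the digest
```

This only confirms that the chunked loop agrees with hashlib's one-shot result. An external tool is still a useful independent check.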

Break the file into 8192-byte chunks (or some other multiple of 64 bytes) and feed them to MD5 consecutively using update().

This takes advantage of the fact that MD5 processes its input in 64-byte (512-bit) blocks (8192 is 64×128); the 128 figure often quoted for MD5 is the digest size in bits. Since you're not reading the entire file into memory, this won't use much more than 8192 bytes of memory.
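
The block size does not have to be taken on faith, since hashlib reports it on every hash object. The short sketch below prints it and also shows that the digest does not depend on where the chunk boundaries fall. The sample data and the deliberately awkward 7-byte step are arbitrary choices for illustration.

```python
import hashlib

h = hashlib.md5()
print(h.block_size)   # 64: internal block size in bytes
print(h.digest_size)  # 16: digest size in bytes (128 bits)

# A chunk size that is a whole number of internal blocks.
chunk_size = 128 * h.block_size  # 8192

# Chunk boundaries affect speed and memory use, not the result.
data = b"x" * 100_000
aligned = hashlib.md5()
for i in range(0, len(data), chunk_size):
    aligned.update(data[i:i + chunk_size])

unaligned = hashlib.md5()
for i in range(0, len(data), 7):
    unaligned.update(data[i:i + 7])

same = aligned.digest() == unaligned.digest() == hashlib.md5(data).digest()
print(same)  # True
```

So the choice of chunk size is a matter of speed and memory use, not correctness.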

In Python 3.8+ you can do

    import hashlib

    with open("your_filename.txt", "rb") as f:
        file_hash = hashlib.md5()
        while chunk := f.read(8192):
            file_hash.update(chunk)

    print(file_hash.digest())
    print(file_hash.hexdigest())  # to get a printable str instead of bytes

Python 3.7 and below

    import hashlib

    def checksum(filename, hash_factory=hashlib.md5, chunk_num_blocks=128):
        h = hash_factory()
        with open(filename, 'rb') as f:
            for chunk in iter(lambda: f.read(chunk_num_blocks*h.block_size), b''):
                h.update(chunk)
        return h.digest()

Python 3.8 and above

    import hashlib

    def checksum(filename, hash_factory=hashlib.md5, chunk_num_blocks=128):
        h = hash_factory()
        with open(filename, 'rb') as f:
            while chunk := f.read(chunk_num_blocks*h.block_size):
                h.update(chunk)
        return h.digest()

If you want a more Pythonic way of reading the file (no while True), check this code:

    import hashlib

    def checksum_md5(filename):
        md5 = hashlib.md5()
        with open(filename, 'rb') as f:
            for chunk in iter(lambda: f.read(8192), b''):
                md5.update(chunk)
        return md5.digest()

Note that iter() needs an empty byte string as its sentinel for the returned iterator to halt at EOF, since read() on a file opened in binary mode returns b'' (not just '').
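
The sentinel behaviour is easy to observe on an in-memory stream. In this small sketch `io.BytesIO` stands in for a file opened with 'rb', and the 4-byte read size is chosen only to keep the output short:

```python
import io

f = io.BytesIO(b"abcdefghij")  # stands in for a file opened with 'rb'

# iter(callable, sentinel) keeps calling f.read(4) until it returns b"".
chunks = list(iter(lambda: f.read(4), b""))
print(chunks)  # [b'abcd', b'efgh', b'ij']

# A str sentinel would never match: bytes and str do not compare equal.
sentinel_matches = (b"" == "")
print(sentinel_matches)  # False
```

With `''` as the sentinel, the loop would keep receiving `b''` after EOF and never terminate.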