Slicing a file in Python


  1. Converting an immutable object into a mutable one does incur a data copy. You can avoid that by reading the file directly into a bytearray:

    import os

    f = open(FILENAME, 'rb')
    # Preallocate a buffer the size of the file, then fill it in place.
    data = bytearray(os.path.getsize(FILENAME))
    f.readinto(data)

from http://eli.thegreenplace.net/2011/11/28/less-copies-in-python-with-the-buffer-protocol-and-memoryviews#id12
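Once the file is in a bytearray, slices can be taken without further copying by going through a memoryview, which is what the linked article is about. Here is a minimal sketch, assuming Python 3; the helper names `load_file` and `slice_view` are made up for illustration and are not from the original answer.

```python
import os


def load_file(path):
    """Read the whole file into a preallocated, mutable bytearray."""
    data = bytearray(os.path.getsize(path))
    with open(path, 'rb') as f:
        f.readinto(data)  # fills the existing buffer in place
    return data


def slice_view(data, start, stop):
    """Return a view of data[start:stop] that shares memory with data."""
    return memoryview(data)[start:stop]
```

Because the view shares memory with the bytearray, writing through it (for example `slice_view(data, 2, 6)[0] = ord('X')`) changes `data` itself, whereas `data[2:6]` would produce an independent copy.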

  1. There is a string-to-bytearray conversion, so there is a potential performance issue.

  2. bytearray is an array, so it can hit the limit of PY_SSIZE_T_MAX/sizeof(PyObject*). For more information, see "How Big can a Python Array Get?"
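To put a number on the limit named in the second point, the figure can be computed on the running interpreter. This is only a sketch, assuming Python 3: `sys.maxsize` is the largest value a `Py_ssize_t` can hold, and `struct.calcsize("P")` gives the size of a C pointer, which stands in for `sizeof(PyObject*)`. The variable names are illustrative.

```python
import struct
import sys

# Largest value a Py_ssize_t can hold, i.e. PY_SSIZE_T_MAX.
max_ssize = sys.maxsize

# Size in bytes of a C pointer on this build (4 or 8 on common platforms).
pointer_size = struct.calcsize("P")

# The quotient from the text: PY_SSIZE_T_MAX / sizeof(PyObject*).
max_pointer_slots = max_ssize // pointer_size

print(max_ssize, pointer_size, max_pointer_slots)
```

Note that this quotient counts pointer-sized slots, as in a list. A bytearray stores raw bytes rather than object pointers, so whether the same divisor applies to it is worth checking before relying on the number.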


You could do this little hack.

    import mmap

    class memmap(mmap.mmap):
        def read_byte(self):
            # Python 2: the base read_byte() returns a 1-character str; convert it to an int.
            return ord(super(memmap, self).read_byte())

Create a class that inherits from mmap.mmap and overrides the default read_byte, which returns a string of length 1, with one that returns an int. (That is the Python 2 behavior; in Python 3, read_byte already returns an int.) You can then use this class like any other mmap object.
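For comparison, here is a rough sketch of slicing a file through a plain mmap, assuming Python 3, where indexing and read_byte already yield ints and the subclass is not needed. The function name `slice_file` is hypothetical.

```python
import mmap


def slice_file(path, start, stop):
    """Map a file read-only and return (bytes in [start, stop), byte at start)."""
    with open(path, 'rb') as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            chunk = mm[start:stop]  # a bytes copy of just this range
            first = mm[start]       # a single index gives an int in Python 3
    return chunk, first
```

Only the requested range is copied out of the mapping, so this avoids reading the whole file into memory. Mapping an empty file raises an error, so callers should check the file size first.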

I hope this helps.