Python MemoryError: cannot allocate array memory


With some help from @J.F. Sebastian I developed the following answer:

    import numpy as np

    # Preallocate the full array, then fill it one row at a time.
    train = np.empty([7049, 9246])
    row = 0
    for line in open("data/training_nohead.csv"):
        train[row] = np.fromstring(line, sep=",")
        row += 1

Of course, this answer assumes prior knowledge of the number of rows and columns. If you do not have this information beforehand, counting the rows will always take a while, because you have to read the entire file and count the \n characters. Something like this will suffice:

    num_rows = 0
    for line in open("data/training_nohead.csv"):
        num_rows += 1

As for the number of columns: if every row has the same number of columns, you can just count the first row; otherwise you need to keep track of the maximum.

    num_rows = 0
    max_cols = 0
    for line in open("data/training_nohead.csv"):
        num_rows += 1
        tmp = line.split(",")
        if len(tmp) > max_cols:
            max_cols = len(tmp)

This solution works best for numerical data, as a string containing a comma could really complicate things.
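
For what it's worth, the counting pass and the filling pass can be put together in one helper. This is only a minimal sketch, assuming the file has no header, no blank lines, and only numeric fields. The names csv_shape and load_numeric_csv are made up for illustration, and padding short rows with NaN is an assumption, not part of the answer above.

```python
import numpy as np


def csv_shape(path):
    """First pass: count the rows and find the widest row."""
    num_rows = 0
    max_cols = 0
    with open(path) as f:
        for line in f:
            num_rows += 1
            max_cols = max(max_cols, len(line.split(",")))
    return num_rows, max_cols


def load_numeric_csv(path):
    """Second pass: fill a preallocated array one row at a time."""
    num_rows, max_cols = csv_shape(path)
    # Start from NaN so that rows shorter than max_cols stay padded (assumption).
    data = np.full((num_rows, max_cols), np.nan)
    with open(path) as f:
        for row, line in enumerate(f):
            values = [float(field) for field in line.split(",")]
            data[row, :len(values)] = values
    return data
```

Usage would look like `train = load_numeric_csv("data/training_nohead.csv")`. The file is read twice, which costs time, but only one row of text is held in memory at any moment besides the final array.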


This is an old discussion, but this might still help people who find it today.

I think I know why str = str + " " * 1000 fails faster than str = " " * 2048000000.

When running the first one, I believe the OS needs to allocate memory for the new object, which is str + " " * 1000, and only after that is the name str bound to it. Until the name str is rebound to the new object, the old one cannot be freed. This means the OS needs to hold roughly two copies of the str object at the same time, so it can only manage about 1 GB instead of 2 GB. I believe the following code will get the same maximum memory out of your OS as a single allocation:

    str = " " * 511000000
    while True:
        l = len(str)
        str = " "  # drop the big string before allocating the next one
        str = " " * (l + 1000)

Feel free to correct me if I am wrong.
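
One way to check the idea above on a small scale is to measure peak allocation with the standard tracemalloc module. This is a rough sketch, not a definitive test: SIZE is kept small so it runs anywhere, the function names are hypothetical, and the intermediate name grown is there only to make sure the old and new strings are alive at the same moment (an interpreter may be able to optimize the direct form s = s + ...).

```python
import tracemalloc

SIZE = 10_000_000  # illustrative size, far below the sizes discussed above


def peak_bytes(func):
    """Return the peak number of bytes traced while func runs."""
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def grow_with_overlap():
    s = " " * SIZE
    # The old string (s) and the new one (grown) exist at the same time here.
    grown = s + " " * 1000
    s = grown
    return len(s)


def grow_after_release():
    s = " " * SIZE
    n = len(s)
    s = ""  # release the big string first
    s = " " * (n + 1000)
    return len(s)


peak_overlap = peak_bytes(grow_with_overlap)
peak_release = peak_bytes(grow_after_release)
print("peak with overlap: ", peak_overlap)
print("peak after release:", peak_release)
```

On CPython the first figure should come out at roughly twice the second, which matches the explanation that the old and new objects have to coexist for a moment.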