Pickle file size when pickling numpy arrays or lists


If you want to store numpy arrays on disk, you shouldn't be using pickle at all. Look into numpy.save() and its kin (numpy.savez(), numpy.savez_compressed(), numpy.load()).
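As a rough sketch of what that looks like, the snippet below writes the same data three ways and reports the file sizes. The array contents, the file names, and the helper name `saved_sizes` are made up for illustration; only `np.save`, `np.load`, `np.savez_compressed` and `pickle.dump` are real library calls. Which file ends up smallest depends on the dtype and the values, so it is worth measuring on your own data rather than assuming.

```python
import os
import pickle
import tempfile

import numpy as np


def saved_sizes(arr):
    """Return on-disk sizes in bytes for three ways of storing `arr`."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Raw .npy file: a small header followed by the array's bytes.
        npy_path = os.path.join(tmp, "data.npy")
        np.save(npy_path, arr)
        # Round trip to confirm nothing is lost.
        assert np.array_equal(np.load(npy_path), arr)
        sizes["npy"] = os.path.getsize(npy_path)

        # Compressed .npz archive holding the same array under the key "data".
        npz_path = os.path.join(tmp, "data.npz")
        np.savez_compressed(npz_path, data=arr)
        sizes["npz_compressed"] = os.path.getsize(npz_path)

        # Plain Python list, pickled with the highest available protocol.
        pkl_path = os.path.join(tmp, "list.pkl")
        with open(pkl_path, "wb") as f:
            pickle.dump(arr.tolist(), f, protocol=pickle.HIGHEST_PROTOCOL)
        sizes["pickled_list"] = os.path.getsize(pkl_path)
    return sizes


# Hypothetical example data: 100,000 64-bit integers.
arr = np.arange(100_000, dtype=np.int64)
sizes = saved_sizes(arr)
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

The .npy file should come out at roughly `arr.nbytes` plus a short header, which makes its size easy to predict from the dtype and shape.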

If you are using pandas then it too has its own methods. You might want to consult this article or the answer to this question for better techniques.


If the data you provided is close to accurate, this seems like premature optimization to me: that is really not a lot of data, and supposedly only integers. For comparison, I am pickling a file right now with millions of entries of strings and integers; at that scale you can start to worry about optimization. In your case the difference likely does not matter much, especially if this is run manually and does not feed into a web app or similar.