
Read a binary file using NumPy fromfile and a given offset


You can open the file with Python's built-in open(), seek past the header, and then pass the file object to np.fromfile. Something like this:

    import os

    import numpy as np

    # Record layout: nine 32-bit floats per record
    dtype = np.dtype([
        ("time", np.float32),
        ("PosX", np.float32),
        ("PosY", np.float32),
        ("Alt", np.float32),
        ("Qx", np.float32),
        ("Qy", np.float32),
        ("Qz", np.float32),
        ("Qw", np.float32),
        ("dist", np.float32),
    ])

    with open("myfile", "rb") as f:
        f.seek(1863, os.SEEK_SET)  # skip the 1863-byte header
        data = np.fromfile(f, dtype=dtype)  # read the rest of the file as records

    print(data)
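
On NumPy 1.17 and later, np.fromfile also accepts an offset argument (in bytes, counted from the file's current position), so the seek can be folded into the call. A minimal sketch, assuming a hypothetical 16-byte header and a two-field record layout chosen only for illustration:

```python
import os
import tempfile

import numpy as np

# Hypothetical layout for illustration: a 16-byte header followed by records.
HEADER_SIZE = 16
record_dtype = np.dtype([("time", np.float32), ("dist", np.float32)])

# Build a small sample file so the example is self-contained.
records = np.array([(0.0, 1.5), (0.5, 2.5), (1.0, 3.5)], dtype=record_dtype)
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * HEADER_SIZE)  # placeholder header bytes
    f.write(records.tobytes())

# Skip the header via offset= instead of an explicit seek().
data = np.fromfile(path, dtype=record_dtype, offset=HEADER_SIZE)
print(data["dist"])
```

Passing a path plus offset avoids managing the file object yourself; the explicit open/seek form is still the safer choice if you need to support older NumPy releases.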


I faced a similar problem, but none of the answers above satisfied me. I needed to implement something like a virtual table with a very large number of binary records, potentially occupying more memory than I could afford in a single NumPy array. So my question was how to read and write a small set of integers from/to a binary file: a subset of the file into a subset of a NumPy array.

This is a solution that worked for me:

    import numpy as np

    recordLen = 10  # number of int64's per record
    recordSize = recordLen * 8  # size of a record in bytes
    memArray = np.zeros(recordLen, dtype=np.int64)  # a buffer for 1 record

    # Create a binary file and open it for write+read
    with open('BinaryFile.dat', 'w+b') as f:
        # Writing the array into the file as record recordNo:
        recordNo = 200  # the index of a target record in the file
        f.seek(recordSize * recordNo)
        raw = memArray.tobytes()
        f.write(raw)

        # Reading a record recordNo from file into the memArray
        f.seek(recordSize * recordNo)
        raw = f.read(recordSize)
        memArray = np.frombuffer(raw, dtype=np.int64).copy()
        # Note copy() added to make the memArray mutable
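
A different technique that may fit the same "virtual table" use case is np.memmap, which maps the file into memory and lets the operating system page records in and out on demand. This is not what the answer above does, only a possible alternative. A minimal sketch, assuming a fixed table capacity (num_records is a hypothetical value) and a temporary file path:

```python
import os
import tempfile

import numpy as np

record_len = 10     # int64 values per record, as in the answer above
num_records = 256   # hypothetical file capacity, for illustration only

path = os.path.join(tempfile.mkdtemp(), "BinaryFile.dat")

# mode="w+" creates (or overwrites) the file, sized to hold the full table.
table = np.memmap(path, dtype=np.int64, mode="w+", shape=(num_records, record_len))

record_no = 200
table[record_no] = np.arange(record_len)  # write one record in place
table.flush()                             # push pending changes to disk
del table                                 # drop the mapping

# Reopen the existing file; mode="r+" allows both reading and writing.
table = np.memmap(path, dtype=np.int64, mode="r+", shape=(num_records, record_len))
record = np.array(table[record_no])       # copy one record into ordinary memory
print(record)
```

Only the pages that are actually touched get loaded, so indexing a single record does not pull the whole file into RAM. The trade-off is that the record count has to be known (or the file resized) up front.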


I suggest using np.frombuffer:

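    import numpy as np

    # file_path, seek_to_position, total_num_bytes and your_dtype_here
    # are placeholders to fill in for your own file
    with open(file_path, 'rb') as file_obj:
        file_obj.seek(seek_to_position)
        data_ro = np.frombuffer(file_obj.read(total_num_bytes), dtype=your_dtype_here)
        data_rw = data_ro.copy()  # without copy(), the result is read-only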
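
To make the read-only point concrete, here is a small self-contained sketch. It uses an in-memory bytes object as a stand-in for file_obj.read(...), and the int32 dtype is just an assumption for the example:

```python
import numpy as np

raw = np.arange(4, dtype=np.int32).tobytes()  # stand-in for file_obj.read(...)

data_ro = np.frombuffer(raw, dtype=np.int32)
print(data_ro.flags.writeable)  # False: the array is a view onto immutable bytes

try:
    data_ro[0] = 99
    write_failed = False
except ValueError:
    write_failed = True  # NumPy refuses to write into a read-only array

data_rw = data_ro.copy()  # independent, writable copy
data_rw[0] = 99
print(data_rw)
```

If you only need to read the values, skipping the copy saves one allocation; copy only when you intend to modify the data.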