Best practices with reading and operating on fortran ordered arrays with numpy
Numpy abstracts away the difference between Fortran ordering and C-ordering at the python level. (In fact, you can even have other orderings for >2d arrays with numpy. They're all treated the same at the python level.)
The only time you'll need to worry about C vs F ordering is when you're reading/writing to disk or passing the array to lower-level functions.
A Simple Example
As an example, let's make a simple 3D array in both C order and Fortran order:
In [1]: import numpy as npIn [2]: c_order = np.arange(27).reshape(3,3,3)In [3]: f_order = c_order.copy(order='F')In [4]: c_orderOut[4]: array([[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8]], [[ 9, 10, 11], [12, 13, 14], [15, 16, 17]], [[18, 19, 20], [21, 22, 23], [24, 25, 26]]])In [5]: f_orderOut[5]: array([[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8]], [[ 9, 10, 11], [12, 13, 14], [15, 16, 17]], [[18, 19, 20], [21, 22, 23], [24, 25, 26]]])
Notice that they both look identical (they are at the level we're interacting with them). How can you tell that they're in different orderings? First off, let's take a look at the flags (pay attention to C_CONTIGUOUS
vs F_CONTIGUOUS
):
In [6]: c_order.flagsOut[6]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : FalseIn [7]: f_order.flagsOut[7]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
And if you don't trust the flags, you can effectively view the memory order by looking at arr.ravel(order='K')
. The order='K'
is important. Otherwise, when you call arr.ravel()
the output will be in C-order regardless of the memory layout of the array. order='K'
uses the memory layout.
In [8]: c_order.ravel(order='K')Out[8]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26])In [9]: f_order.ravel(order='K')Out[9]: array([ 0, 9, 18, 3, 12, 21, 6, 15, 24, 1, 10, 19, 4, 13, 22, 7, 16, 25, 2, 11, 20, 5, 14, 23, 8, 17, 26])
The difference is actually represented (and stored) in the strides
of the array. Notice that c_order
's strides are (72, 24, 8)
, while f_order
's strides are (8, 24, 72)
.
Just to prove that the indexing works the same way:
In [10]: c_order[0,1,2]Out[10]: 5In [11]: f_order[0,1,2]Out[11]: 5
Reading and Writing
The main place where you'll run into problems with this is when you're reading from or writing to disk. Many file formats expect a particular ordering. I'm guessing that you're working with seismic data formats, and most of them (e.g. Geoprobe .vol's, and I think Petrel's volume format as well) essentially write a binary header and then a Fortran-ordered 3D array to disk.
With that in mind, I'll use a small seismic cube (snippet of some data from my dissertation) as an example.
Both of these are binary arrays of uint8
s with a shape of 50x100x198. One is in C-order, while the other is in Fortran-order. c_order.dat f_order.dat
To read them in:
import numpy as npshape = (50, 100, 198)c_order = np.fromfile('c_order.dat', dtype=np.uint8).reshape(shape)f_order = np.fromfile('f_order.dat', dtype=np.uint8).reshape(shape, order='F')assert np.all(c_order == f_order)
Notice that the only difference is specifying the memory layout to reshape
. The memory layout of the two arrays is still different (reshape doesn't make a copy), but they're treated identically at the python level.
Just to prove that the files really are written in a different order:
In [1]: np.fromfile('c_order.dat', dtype=np.uint8)[:10]Out[1]: array([132, 142, 107, 204, 37, 37, 217, 37, 82, 60], dtype=uint8)In [2]: np.fromfile('f_order.dat', dtype=np.uint8)[:10]Out[2]: array([132, 129, 140, 138, 110, 88, 110, 124, 142, 139], dtype=uint8)
Let's visualize the result:
def plot(data): fig, axes = plt.subplots(ncols=3) for i, ax in enumerate(axes): slices = [slice(None), slice(None), slice(None)] slices[i] = data.shape[i] // 2 ax.imshow(data[tuple(slices)].T, cmap='gray_r') return figplot(c_order).suptitle('C-ordered array')plot(f_order).suptitle('F-ordered array')plt.show()
Notice that we indexed them the same way, and they're displayed identically.
Common Mistakes with IO
First off, let's try reading in the Fortran-ordered file as if it were C-ordered and then take a look at the result (using the plot
function above):
wrong_order = np.fromfile('f_order.dat', dtype=np.uint8).reshape(shape)plot(wrong_order)
Not so good!
You mentioned that you're having to use "reversed" indicies. This is probably because you fixed what happened in the figure above by doing something like (note the reversed shape!):
c_order = np.fromfile('c_order.dat', dtype=np.uint8).reshape([50,100,198])rev_f_order = np.fromfile('f_order.dat', dtype=np.uint8).reshape([198,100,50])
Let's visualize what happens:
plot(c_order).suptitle('C-ordered array')plot(rev_f_order).suptitle('Incorrectly read Fortran-ordered array')
Note that the image on the far right (the timeslice) of the first plot matches a transposed version of the image on the far left of the second.
Similarly, print rev_f_order[1,2,3]
and print c_order[3,2,1]
both yield 140
, while indexing them the same way gives a different result.
Basically, this is where your reversed indices come from. Numpy thinks it's a C-ordered array with a different shape. Notice if we look at the flags, they're both C-contiguous in memory:
In [24]: rev_f_order.flagsOut[24]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : FalseIn [25]: c_order.flagsOut[25]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
This is because a fortran-ordered array is equivalent to a C-ordered array with the reverse shape.
Writing to Disk in Fortran-Order
There's an additional wrinkle when writing a numpy array to disk in Fortran-order.
Unless you specify otherwise, the array will be written in C-order regardless of its memory-layout! (There's a clear note about this in the documentation for ndarray.tofile
, but it's a common gotcha. The opposite behavior would be incorrect, though, i.m.o.)
Therefore, regardless of the memory layout of an array, to write it to disk in Fortran order, you need to do:
arr.ravel(order='F').tofile('output.dat')
If you're writing it as ascii, the same applies. Use ravel(order='F')
and then write out the 1-dimensional result.