understanding numpy's dstack function

python numpy concatenation multidimensional-array

It's easier to understand what np.vstack, np.hstack and np.dstack* do by looking at the .shape attribute of the output array.

Using your two example arrays:

```
print(a.shape, b.shape)  # (3, 2) (3, 2)
```

np.vstack concatenates along the first dimension...
```
print(np.vstack((a, b)).shape)  # (6, 2)
```
np.hstack concatenates along the second dimension (for inputs with at least two dimensions; 1-D inputs are joined along their only axis)...
```
print(np.hstack((a, b)).shape)  # (3, 4)
```
and np.dstack concatenates along the third dimension.
```
print(np.dstack((a, b)).shape)  # (3, 2, 2)
```

Since a and b are both two-dimensional, np.dstack first expands each of them by inserting a third dimension of size 1, then concatenates along that new dimension. The expansion is equivalent to indexing them in the third dimension with np.newaxis (or alternatively, None) like this:

```
print(a[:, :, np.newaxis].shape)  # (3, 2, 1)
```
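NumPy also exposes this expansion step as np.atleast_3d, and the np.dstack documentation describes the same reshaping. Here is a minimal sketch of it; the array values are made up for illustration, and the 1-D case is included because it is padded differently from the 2-D case:

```python
import numpy as np

# Hypothetical example array; any 2-D shape would do.
a = np.arange(6).reshape(3, 2)

# For 2-D input, np.atleast_3d appends a trailing axis of size 1,
# which matches indexing with np.newaxis (or None).
expanded = np.atleast_3d(a)
print(expanded.shape)  # (3, 2, 1)
print(np.array_equal(expanded, a[:, :, np.newaxis]))  # True

# 1-D input is handled differently: shape (N,) becomes (1, N, 1).
v = np.arange(3)
print(np.atleast_3d(v).shape)  # (1, 3, 1)
print(np.dstack((v, v)).shape)  # (1, 3, 2)
```

So for 1-D inputs the result of np.dstack gains a leading axis as well as a trailing one.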

If c = np.dstack((a, b)), then c[:, :, 0] == a and c[:, :, 1] == b.

You could do the same operation more explicitly using np.concatenate like this:

```
print(np.concatenate((a[..., None], b[..., None]), axis=2).shape)  # (3, 2, 2)
```
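To check that these really are the same operation and not just the same shape, you can compare the results element by element. This is only a sketch: the values of a and b are assumed (any two arrays of shape (3, 2) would work), and np.stack is thrown in as one more spelling that happens to agree for 2-D inputs of equal shape.

```python
import numpy as np

# Assumed example values with shape (3, 2).
a = np.array([[0, 3], [1, 4], [2, 5]])
b = np.array([[6, 9], [7, 10], [8, 11]])

c = np.dstack((a, b))
via_concatenate = np.concatenate((a[..., None], b[..., None]), axis=2)
via_stack = np.stack((a, b), axis=2)  # joins along a new axis at position 2

print(np.array_equal(c, via_concatenate))  # True
print(np.array_equal(c, via_stack))  # True

# Slicing along the third axis recovers the inputs.
print(np.array_equal(c[:, :, 0], a), np.array_equal(c[:, :, 1], b))  # True True
```

Note that np.stack always creates a new axis, so it stops matching np.dstack once the inputs are already three-dimensional.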

* Importing the entire contents of a module into your global namespace using import * is considered bad practice for several reasons. The idiomatic way is to import numpy as np.


Let x = dstack([a, b]). Then x[:, :, 0] is identical to a, and x[:, :, 1] is identical to b. In general, when dstacking 2D arrays, dstack produces an output such that output[:, :, n] is identical to the nth input array (counting from zero).
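A quick way to convince yourself of the general rule is to stack more than two arrays. The sketch below uses four made-up inputs, each filled with its own index so they are easy to tell apart:

```python
import numpy as np

# Four hypothetical 2-D inputs of shape (3, 2); input n is filled with the value n.
inputs = [np.full((3, 2), n) for n in range(4)]

out = np.dstack(inputs)
print(out.shape)  # (3, 2, 4)

# output[:, :, n] matches the nth input for every n.
matches = [np.array_equal(out[:, :, n], arr) for n, arr in enumerate(inputs)]
print(matches)  # [True, True, True, True]
```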

If we stack 3D arrays rather than 2D:

```
import numpy

x = numpy.zeros([2, 2, 3])
y = numpy.ones([2, 2, 4])
z = numpy.dstack([x, y])
```

then z[:, :, :3] would be identical to x, and z[:, :, 3:7] would be identical to y.
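Here is a small runnable check of the 3D case. It uses the np alias instead of the full module name, and np.dsplit (which splits along the third axis) as one possible way to undo the stacking:

```python
import numpy as np

x = np.zeros([2, 2, 3])
y = np.ones([2, 2, 4])
z = np.dstack([x, y])
print(z.shape)  # (2, 2, 7)

# Slices along the third axis give back the inputs.
print(np.array_equal(z[:, :, :3], x))  # True
print(np.array_equal(z[:, :, 3:7], y))  # True

# np.dsplit cuts along the third axis at index 3, yielding the same two pieces.
x_back, y_back = np.dsplit(z, [3])
print(x_back.shape, y_back.shape)  # (2, 2, 3) (2, 2, 4)
```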

As you can see, we have to take slices along the third axis to recover the inputs to dstack. That's why dstack behaves the way it does.


I'd like to take a stab at explaining this visually (the accepted answer makes enough sense, but it still took me a few seconds to wrap my head around it).

If we imagine a 2D array as a list of lists, where the 1st axis picks one of the inner lists and the 2nd axis picks a value within that list, then the OP's arrays look like this:

```
a = [
    [0, 3],
    [1, 4],
    [2, 5]
]
b = [
    [6,  9],
    [7, 10],
    [8, 11]
]
# Shape of each array is [3, 2]
```

Now, according to the current documentation, the dstack function adds a 3rd axis, which means each of the arrays ends up looking like this:

```
a = [
    [[0], [3]],
    [[1], [4]],
    [[2], [5]]
]
b = [
    [[6],  [9]],
    [[7], [10]],
    [[8], [11]]
]
# Shape of each array is [3, 2, 1]
```

Now, stacking both these arrays in the 3rd dimension simply means that the result should look, as expected, like this:

```
dstack([a, b]) = [
    [[0, 6], [3, 9]],
    [[1, 7], [4, 10]],
    [[2, 8], [5, 11]]
]
# Shape of the combined array is [3, 2, 2]
```
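If you want to reproduce the picture yourself, something like the following should do it. It is just a sketch using the array values from this answer, with tolist() used to print the nested-list view of the intermediate and final arrays:

```python
import numpy as np

a = np.array([[0, 3], [1, 4], [2, 5]])
b = np.array([[6, 9], [7, 10], [8, 11]])

# Step 1: each value gets wrapped in its own innermost list (shape (3, 2, 1)).
a3 = a[:, :, np.newaxis]
print(a3.tolist())  # [[[0], [3]], [[1], [4]], [[2], [5]]]

# Step 2: stacking along the 3rd axis pairs up values at the same position.
c = np.dstack([a, b])
print(c.shape)  # (3, 2, 2)
print(c.tolist())
# [[[0, 6], [3, 9]], [[1, 7], [4, 10]], [[2, 8], [5, 11]]]
```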

Hope this helps.
