Vectorising numpy.einsum
One way would be using all the dimensions for T
-
np.einsum('Hr, Ar, Dr, ATr ->HADT', H, A, D, T)
Since, we need to sum-reduce axis-r
across all inputs, while keeping all others (axes) in the output, I don't see any intermediate way of doing it/bringing in any dot-based tools on this to leverage BLAS.