Why is the GNU scientific library matrix multiplication slower than numpy.matmul?