Why is it faster to perform float by float matrix multiplication compared to int by int?
All those vector-vector and matrix-vector operations use BLAS internally. BLAS, which has been optimized over decades for different architectures, CPUs, instruction sets, and cache sizes, has no integer type!
Here is a branch of OpenBLAS working on it (and a small discussion on Google Groups linking to it).
And I think I heard that Intel's MKL (Intel's BLAS implementation) might be working on integer types too. This talk (mentioned in that forum) looks interesting, although it is short and probably deals more with the small integral types that are useful in embedded deep learning.
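To see the difference directly, here is a minimal benchmark sketch of my own (it is not taken from those discussions; it uses Eigen, like the example further below, and caps the integer entries with % 100 to keep the integer products from overflowing):

#include <Eigen/Core>
#include <chrono>
#include <iostream>

int main()
{
    const int n = 1024;

    // Small integer entries so the dot products cannot overflow.
    Eigen::MatrixXi Ai = Eigen::MatrixXi::Random(n, n).unaryExpr([](int x) { return x % 100; });
    Eigen::MatrixXi Bi = Eigen::MatrixXi::Random(n, n).unaryExpr([](int x) { return x % 100; });
    Eigen::MatrixXf Af = Eigen::MatrixXf::Random(n, n);
    Eigen::MatrixXf Bf = Eigen::MatrixXf::Random(n, n);

    auto t0 = std::chrono::steady_clock::now();
    Eigen::MatrixXi Ci = Ai * Bi;   // forces evaluation of the integer product
    std::chrono::duration<double> ti = std::chrono::steady_clock::now() - t0;

    t0 = std::chrono::steady_clock::now();
    Eigen::MatrixXf Cf = Af * Bf;   // forces evaluation of the float product
    std::chrono::duration<double> tf = std::chrono::steady_clock::now() - t0;

    // Print one entry of each result so the products are not optimized away.
    std::cout << "int:   " << ti.count() << " s (" << Ci(0, 0) << ")\n";
    std::cout << "float: " << tf.count() << " s (" << Cf(0, 0) << ")\n";
}

On a typical AVX2 machine the float product should come out clearly faster, for the instruction-level reasons shown next.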
If you compile these two simple functions, which essentially just calculate a product (using the Eigen library),
#include <Eigen/Core>

int mult_int(const Eigen::MatrixXi& A, Eigen::MatrixXi& B)
{
    Eigen::MatrixXi C = A * B;
    return C(0, 0);
}

int mult_float(const Eigen::MatrixXf& A, Eigen::MatrixXf& B)
{
    Eigen::MatrixXf C = A * B;
    return C(0, 0);
}
using the flags -mavx2 -S -O3
you will see very similar assembly code for the integer and the float version. The main difference, however, is that vpmulld has 2-3 times the latency and only 1/2 or 1/4 the throughput of vmulps (on recent Intel architectures).
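For reference, here is a small sketch of my own (the function names are mine) showing the intrinsics that compile to exactly those two instructions when built with -mavx2:

#include <immintrin.h>

// 8 x 32-bit integer multiplies; compiles to vpmulld
// (higher latency, lower throughput).
__m256i mul_int8(__m256i a, __m256i b)
{
    return _mm256_mullo_epi32(a, b);
}

// 8 x 32-bit float multiplies; compiles to vmulps
// (lower latency, higher throughput).
__m256 mul_float8(__m256 a, __m256 b)
{
    return _mm256_mul_ps(a, b);
}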
Reference: the Intel Intrinsics Guide. "Throughput" here means the reciprocal throughput, i.e., how many clock cycles are used per operation when there are no latency stalls (somewhat simplified).
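As a worked illustration (the concrete numbers are my own, drawn from public instruction tables, and vary by microarchitecture): if vmulps has a reciprocal throughput of 0.5 cycles, the core can sustain 2 float multiplies per cycle; if vpmulld has a reciprocal throughput of 1 or 2 cycles, it sustains only 1 or 0.5 integer multiplies per cycle, which is exactly the 1/2 or 1/4 ratio quoted above.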