How to run TensorFlow on multiple cores and threads
According to TensorFlow:

The two configurations listed below are used to optimize CPU performance by adjusting the thread pools.

intra_op_parallelism_threads: Nodes that can use multiple threads to parallelize their execution will schedule the individual pieces into this pool.

inter_op_parallelism_threads: All ready nodes are scheduled in this pool.

These configurations are set via tf.ConfigProto and passed to tf.Session in the config attribute, as shown in the snippet below. If either option is unset or set to 0, it defaults to the number of logical CPU cores. Testing has shown that the default is effective for systems ranging from one CPU with 4 cores to multiple CPUs with 70+ combined logical cores. A common alternative optimization is to set the number of threads in both pools equal to the number of physical cores rather than logical cores.
config = tf.ConfigProto()
config.intra_op_parallelism_threads = 44
config.inter_op_parallelism_threads = 44
tf.Session(config=config)
In versions of TensorFlow before 1.2, it was recommended to use multi-threaded, queue-based input pipelines for performance. Beginning with TensorFlow 1.4, however, the recommended approach is the tf.data module instead.
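Since tf.data is the recommended approach, here is a minimal sketch of a parallel input pipeline, assuming TensorFlow 1.4 or later; the preprocess function, the sample data, and the parallelism value are illustrative assumptions, not from the original answer:

```python
import tensorflow as tf

# Illustrative per-example transform; any preprocessing function works here.
def preprocess(x):
    return x * 2

# Build a small in-memory dataset, apply the transform with several
# worker threads (num_parallel_calls), then batch and prefetch so input
# preparation overlaps with model execution.
dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4])
dataset = dataset.map(preprocess, num_parallel_calls=4)
dataset = dataset.batch(2).prefetch(1)
```

The num_parallel_calls argument plays a role similar to the thread-pool settings above: it controls how many elements are preprocessed concurrently.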
Yes, on Linux you can check your CPU usage with top and press 1 to show the usage per CPU. Note: the percentage shown depends on whether top is in Irix or Solaris mode.
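If you would rather check from Python, os.cpu_count() reports the logical-core count, which is the same number TensorFlow falls back to for both thread pools when they are unset or 0:

```python
import os

# Number of logical CPU cores visible to the process; TensorFlow uses
# this as the default for intra_op and inter_op thread-pool sizes.
logical_cores = os.cpu_count()
print(logical_cores)
```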
This is a comment, but I'm posting it as an answer because I don't have enough rep to post comments yet. Marco D.G.'s answer is correct; I just wanted to add the fun fact that with tf.device('/cpu:0'), TensorFlow automatically tries to use all available cores. Happy flowing!
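As a quick illustration of that fun fact (a sketch assuming a TensorFlow build with eager execution; the shapes are arbitrary), an op pinned to the CPU is still parallelized across cores by the intra-op thread pool:

```python
import tensorflow as tf

# Pin a matmul to the CPU device; TensorFlow still splits the work
# across the intra-op thread pool, i.e. all logical cores by default.
with tf.device('/cpu:0'):
    a = tf.ones((256, 256))
    b = tf.ones((256, 256))
    c = tf.matmul(a, b)
```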
For me, it worked this way:
from multiprocessing.dummy import Pool as ThreadPool
....
pool = ThreadPool()
outputs = pool.starmap(run_on_sess, [(tf_vars, data1), (tf_vars, data2)])
pool.close()
pool.join()
You should initialize the session and make the session-related variables available globally as part of tf_vars. Create a run_on_sess function that performs the sess.run step and any subsequent computations for a single batch; the batches named data1 and data2 are then processed in a Pythonic multithreaded environment.
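Putting the pieces together, here is a runnable sketch of the pattern; the body of run_on_sess and the contents of tf_vars are stand-ins (in the real setup, tf_vars would hold the session and graph handles, and run_on_sess would call sess.run):

```python
from multiprocessing.dummy import Pool as ThreadPool

# Stand-in for the session-running function described above; in practice
# this would call sess.run on the session stored in tf_vars.
def run_on_sess(tf_vars, data):
    return [tf_vars["scale"] * x for x in data]

# Stand-in for globally shared session state; names are illustrative.
tf_vars = {"scale": 2}
data1, data2 = [1, 2], [3, 4]

# Each (tf_vars, batch) tuple is dispatched to a worker thread.
pool = ThreadPool()
outputs = pool.starmap(run_on_sess, [(tf_vars, data1), (tf_vars, data2)])
pool.close()
pool.join()
print(outputs)  # one result list per batch
```

Because multiprocessing.dummy uses threads rather than processes, all workers share the same session object, which is what makes this pattern work with a single TensorFlow graph.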