
How to run TensorFlow on multiple cores and threads


According to TensorFlow:

The two configurations listed below are used to optimize CPU performance by adjusting the thread pools.

  • intra_op_parallelism_threads: Nodes that can use multiple threads to parallelize their execution will schedule the individual pieces into this pool.
  • inter_op_parallelism_threads: All ready nodes are scheduled in this pool.

These configurations are set via tf.ConfigProto and passed to tf.Session in the config attribute, as shown in the snippet below. If either option is unset or set to 0, it defaults to the number of logical CPU cores. Testing has shown that the default is effective for systems ranging from a single CPU with 4 cores to multiple CPUs with 70+ combined logical cores. A common alternative optimization is to set the number of threads in both pools equal to the number of physical cores rather than logical cores.
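The logical/physical distinction matters when picking those thread counts. A minimal sketch for comparing the two numbers, assuming a Linux system where /proc/cpuinfo is available (on other platforms, a library such as psutil with psutil.cpu_count(logical=False) would be the usual choice):

```python
import os

# Logical cores: what the OS schedules on (includes hyper-threads).
logical = os.cpu_count()

# Physical cores: count unique (physical id, core id) pairs in /proc/cpuinfo.
# This is Linux-specific and falls back to the logical count if the
# fields are absent (e.g. on some ARM systems).
cores = set()
physical_id = core_id = None
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("physical id"):
            physical_id = line.split(":")[1].strip()
        elif line.startswith("core id"):
            core_id = line.split(":")[1].strip()
            cores.add((physical_id, core_id))
physical = len(cores) or logical

print(f"logical={logical}, physical={physical}")
```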

config = tf.ConfigProto()
config.intra_op_parallelism_threads = 44
config.inter_op_parallelism_threads = 44
tf.Session(config=config)

In versions of TensorFlow before 1.2, it was recommended to use multi-threaded, queue-based input pipelines for performance. Beginning with TensorFlow 1.4, however, it is recommended to use the tf.data module instead.


Yes, on Linux you can check your CPU usage with top and press 1 to show the usage per CPU. Note: the percentage shown depends on whether top is in Irix or Solaris mode.
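If you want the same per-CPU view programmatically, a rough sketch (assuming Linux, where /proc/stat exposes one line of cumulative jiffy counters per logical CPU) would be:

```python
# Read per-CPU jiffy counters from /proc/stat (Linux-specific).
# Lines look like: "cpu0 4705 356 584 3699 ..." -- one per logical CPU.
per_cpu = {}
with open("/proc/stat") as f:
    for line in f:
        fields = line.split()
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            values = [int(v) for v in fields[1:]]
            # Treat idle (4th column) and iowait (5th, if present) as idle time.
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            busy = sum(values) - idle
            per_cpu[fields[0]] = (busy, idle)

for cpu, (busy, idle) in sorted(per_cpu.items()):
    total = busy + idle
    pct = 100 * busy / total if total else 0.0
    print(f"{cpu}: {pct:.1f}% busy since boot")
```

These are cumulative counters since boot; top computes its percentages from the difference between two samples.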


This is a comment, but I'm posting it as an answer because I don't have enough rep to post comments yet. Marco D.G.'s answer is correct; I just wanted to add the fun fact that with tf.device('/cpu:0') automatically tries to use all available cores. Happy flowing!


For me, it worked this way:

from multiprocessing.dummy import Pool as ThreadPool

# ... set up the session and tf_vars ...
pool = ThreadPool()
outputs = pool.starmap(run_on_sess, [(tf_vars, data1), (tf_vars, data2)])
pool.close()
pool.join()

You should initialize the session and make session-related variables available globally as part of tf_vars. Create a run_on_sess function that performs the sess.run step and any posterior computations for a single batch; the batches named data1 and data2 are then processed in a multithreaded Python environment.
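The same pattern can be seen in a self-contained sketch, with a stand-in for run_on_sess (the function body and data values below are placeholders, not part of the original answer):

```python
from multiprocessing.dummy import Pool as ThreadPool  # thread-based Pool

# Stand-in for a session-backed computation: shared_state plays the role
# of tf_vars, and the second argument is one batch of data.
def run_on_sess(shared_state, batch):
    return shared_state["scale"] * sum(batch)

shared_state = {"scale": 2}
data1 = [1, 2, 3]
data2 = [4, 5, 6]

pool = ThreadPool()  # defaults to one worker thread per logical CPU
outputs = pool.starmap(run_on_sess, [(shared_state, data1), (shared_state, data2)])
pool.close()
pool.join()

print(outputs)  # [12, 30]
```

multiprocessing.dummy provides the Pool API backed by threads rather than processes, which is what you want when the heavy lifting happens inside sess.run (which releases the GIL).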