Dynamically decide which GPU to run on - TF on NVIDIA docker


As described in the CUDA programming guide, the default device enumeration used by CUDA is "fastest first":

CUDA_DEVICE_ORDER

Values: FASTEST_FIRST, PCI_BUS_ID (default is FASTEST_FIRST)

FASTEST_FIRST causes CUDA to guess which device is fastest using a simple heuristic, and makes that device 0, leaving the order of the rest of the devices unspecified.

PCI_BUS_ID orders devices by PCI bus ID in ascending order.

If you set CUDA_DEVICE_ORDER=PCI_BUS_ID, the CUDA ordering will match the device ordering shown by nvidia-smi.
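For example, in a TensorFlow program you can pick a GPU dynamically by setting these variables before CUDA is first initialized, i.e. before the first `import tensorflow`. A minimal sketch (the device index `1` is an arbitrary choice for illustration):

```python
import os

# Must be set before TensorFlow (or any other CUDA client) initializes CUDA,
# i.e. before the first `import tensorflow`.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"   # match nvidia-smi numbering
os.environ["CUDA_VISIBLE_DEVICES"] = "1"         # expose only GPU 1 (arbitrary example)

# import tensorflow as tf  # TF now sees a single GPU, renumbered as /device:GPU:0
```

Note that whatever devices are listed in CUDA_VISIBLE_DEVICES are renumbered starting from 0 inside the process.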

Since you are using docker, you can also enforce stronger isolation with our runtime:
docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 ...
But that selection is fixed at container startup time.
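The two mechanisms compose: NVIDIA_VISIBLE_DEVICES is fixed when the container starts, while CUDA_VISIBLE_DEVICES can still be chosen per process inside the running container. A sketch (the image name is hypothetical):

```shell
# Fixed for the container's lifetime: expose all GPUs to the container
# (image name "my-tf-image" is hypothetical):
#   docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all my-tf-image
# Decided per process inside the container: restrict this run to GPU 0,
# numbered in nvidia-smi (PCI bus) order:
CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 env | grep '^CUDA_'
```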