Dynamically decide which GPU to run on - TF on NVIDIA docker
As described in the CUDA programming guide, the default device enumeration used by CUDA is "fastest first":
CUDA_DEVICE_ORDER
Accepted values: FASTEST_FIRST, PCI_BUS_ID (default is FASTEST_FIRST)
FASTEST_FIRST causes CUDA to guess which device is fastest using a simple heuristic, and make that device 0, leaving the order of the rest of the devices unspecified.
PCI_BUS_ID orders devices by PCI bus ID in ascending order.
If you set CUDA_DEVICE_ORDER=PCI_BUS_ID, the CUDA device ordering will match the device ordering shown by nvidia-smi.
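Because these environment variables are read when CUDA initializes, they must be set before TensorFlow (or any other CUDA-using library) is imported. A minimal sketch in Python — the GPU index 1 here is just an illustration:

```python
import os

# Must be set before TensorFlow initializes the CUDA driver,
# i.e. before `import tensorflow` runs anywhere in the process.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"   # match nvidia-smi's ordering
os.environ["CUDA_VISIBLE_DEVICES"] = "1"         # expose only GPU 1 (illustrative index)

# import tensorflow as tf  # TF now sees a single device, renumbered as GPU 0

print(os.environ["CUDA_DEVICE_ORDER"], os.environ["CUDA_VISIBLE_DEVICES"])
```

Note that the devices listed in CUDA_VISIBLE_DEVICES are renumbered starting from 0 inside the process, so TensorFlow will report the selected GPU as device 0.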
Since you are using Docker, you can also enforce stronger isolation with our runtime:

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 ...
But that selection happens at container startup time, not dynamically from inside the running process.
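To decide dynamically at runtime, one common pattern is to query nvidia-smi for per-GPU free memory and pick the least-loaded device before importing TensorFlow. A sketch under these assumptions: `pick_freest_gpu` is a hypothetical helper name, and nvidia-smi enumerates devices in PCI bus order (which is why CUDA_DEVICE_ORDER=PCI_BUS_ID is set to match):

```python
import os
import subprocess

def pick_freest_gpu(smi_output=None):
    """Return the index (in nvidia-smi / PCI bus order) of the GPU
    with the most free memory. `smi_output` can be injected for testing;
    otherwise nvidia-smi is queried directly."""
    if smi_output is None:
        smi_output = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,noheader,nounits"],
            text=True)
    free_mib = [int(line) for line in smi_output.strip().splitlines()]
    return max(range(len(free_mib)), key=free_mib.__getitem__)

# Example with captured output (two GPUs; the second has more free MiB):
sample = "1024\n8192\n"
idx = pick_freest_gpu(sample)
print(idx)  # → 1

# Apply the choice before importing TensorFlow:
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = str(idx)
```

This keeps the decision inside the container, so a single `docker run` with all GPUs visible can still bind each process to one device.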