Resetting GPU and driver after CUDA error
Edit:
If you are on Tesla hardware on Linux and can run nvidia-smi, then you can reset the GPU using
nvidia-smi -r
or
nvidia-smi --gpu-reset
Here is the man output for this switch:
Resets GPU state. Can be used to clear double bit ECC errors or recover hung GPU. Requires -i switch to target specific device. Available on Linux only.
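Since the man text says the reset needs the -i switch, a full invocation would name the device explicitly. Below is a minimal sketch, not a definitive recipe: gpu_reset_cmd is a hypothetical helper name, and GPU index 0 is an assumption (run nvidia-smi -L to list the indices on your machine). It only prints the command so you can check it before running it. As far as I know, the reset typically needs root and will fail while any process is still using the GPU.

```shell
#!/bin/sh
# Hypothetical helper: build the reset command for a single GPU index.
# The -i switch selects the target device, as the man excerpt requires.
gpu_reset_cmd() {
    case "$1" in
        # Reject an empty or non-numeric index.
        ''|*[!0-9]*)
            echo "usage: gpu_reset_cmd <gpu-index>" >&2
            return 1
            ;;
    esac
    echo "nvidia-smi --gpu-reset -i $1"
}

# Print the command for GPU 0 (an assumed index) without executing it.
gpu_reset_cmd 0
```

Once the printed command looks right, it can be run by hand, for example with sudo in front of it.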
Otherwise...
The way to truly reset the hardware is to reboot.
What you describe shouldn't happen. I recommend testing on different hardware and letting us know whether it still occurs.