Can OpenMP be used for GPUs?

multithreading fortran gpu openmp openacc

Yes. The OpenMP 4 target constructs were designed to support a wide range of accelerators. Compiler support for NVIDIA GPUs is available from GCC 7+ (see 1 and 2, although the latter has not been updated to reflect OpenMP 4 GPU support), Clang (see 3,4,5), and Cray. Compiler support for Intel GPUs is available in the Intel C/C++ compiler (see e.g. 6).

The IBM-developed Clang/LLVM implementation of OpenMP 4+ for NVIDIA GPUs is available from https://github.com/clang-ykt. The build recipe is provided in "OpenMP compiler for CORAL/OpenPower Heterogeneous Systems".

The Cray compiler supports OpenMP target for NVIDIA GPUs. From the Cray Fortran Reference Manual (8.5):

The OpenMP 4.5 target directives are supported for targeting NVIDIA GPUs or the current CPU target. An appropriate accelerator target module must be loaded to use target directives.
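To see which of the two targets mentioned in the manual a target region actually ran on, one option is to query the runtime from inside the region. The sketch below is not from the manual; `target_ran_on_host` is a hypothetical helper name, and the code assumes a compiler that implements OpenMP 4.0 or later (it falls back to plain host code when OpenMP is disabled).

```c
#if defined(_OPENMP) && _OPENMP >= 201307
#include <omp.h>
#endif

/* Hypothetical helper: returns 1 if the target region executed on the host
   (no device available, or OpenMP disabled), 0 if it executed on a device. */
int target_ran_on_host(void)
{
    int on_host = 1;
#if defined(_OPENMP) && _OPENMP >= 201307
    /* map(from:) copies the value written inside the region back to the host. */
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
#endif
    return on_host;
}
```

Calling `target_ran_on_host()` from `main` and printing the result is a quick way to confirm that the accelerator module and offload flags took effect.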

The Intel compiler supports OpenMP target for Intel Gen graphics for C/C++ but not Fortran. Furthermore, the teams and distribute constructs are not supported because they are not necessary or appropriate. Below is a simple example showing how the OpenMP target features work in different environments.

    void vadd2(int n, float * a, float * b, float * c)
    {
        #pragma omp target map(to:n,a[0:n],b[0:n]) map(from:c[0:n])
    #if defined(__INTEL_COMPILER) && defined(__INTEL_OFFLOAD)
        #pragma omp parallel for simd
    #else
        #pragma omp teams distribute parallel for simd
    #endif
        for(int i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }

The compiler options for Intel and GCC are as follows. I don't have GCC set up for NVIDIA GPUs, but you can see the GCC documentation for the appropriate -foffload options.

    $ icc -std=c99 -qopenmp -qopenmp-offload=gfx -c vadd2.c && echo "SUCCESS" || echo "FAIL"
    SUCCESS
    $ gcc-7 -fopenmp -c vadd2.c && echo "SUCCESS" || echo "FAIL"
    SUCCESS
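The commands above only compile the routine. To check that it also produces correct results, a small host-side check along the following lines may help. This is a sketch, not part of the original answer: `check_vadd2` is a hypothetical helper, and the kernel is repeated so the file stands alone. When OpenMP is disabled or no device is present, the loop simply runs on the host.

```c
#include <stdlib.h>

/* Same kernel as above: offloaded when the compiler supports it; otherwise
   the pragmas are ignored and the loop runs serially on the host. */
void vadd2(int n, float *a, float *b, float *c)
{
    #pragma omp target map(to:n,a[0:n],b[0:n]) map(from:c[0:n])
#if defined(__INTEL_COMPILER) && defined(__INTEL_OFFLOAD)
    #pragma omp parallel for simd
#else
    #pragma omp teams distribute parallel for simd
#endif
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Hypothetical helper: fills two inputs, runs vadd2, and returns the number
   of wrong output elements (or -1 if allocation fails). */
int check_vadd2(int n)
{
    float *a = malloc((size_t)n * sizeof *a);
    float *b = malloc((size_t)n * sizeof *b);
    float *c = malloc((size_t)n * sizeof *c);
    int errors = -1;

    if (a && b && c) {
        for (int i = 0; i < n; i++) {
            a[i] = (float)i;
            b[i] = 2.0f * (float)i;
            c[i] = -1.0f;           /* sentinel, overwritten by the kernel */
        }
        vadd2(n, a, b, c);
        errors = 0;
        for (int i = 0; i < n; i++)
            if (c[i] != 3.0f * (float)i)   /* exact for small integers */
                errors++;
    }
    free(a);
    free(b);
    free(c);
    return errors;
}
```

A `main` that returns `check_vadd2(1000) != 0` gives a pass/fail exit status that can be chained with `&& echo "SUCCESS"` in the same way as the compile commands.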


The OpenMP 4.0 standard includes support for accelerators (GPU, DSP, Xeon Phi, and so on), but I am not aware of any existing implementation of OpenMP 4.0 for GPUs, only reports of early experience.
OpenACC is indeed similar to OpenMP and easy to use. A good OpenACC tutorial: part 1 and part 2.

Unfortunately, I think there is no portable solution for both CPU and GPU, at least for now (except for OpenCL, but it is too low-level compared to OpenMP and OpenACC).

If you need a portable solution, you could consider using an Intel Xeon Phi accelerator instead of a GPU. The Intel Fortran (and C/C++) compiler includes OpenMP support for both the CPU and Xeon Phi.

In addition, choosing a suitable parallel technology is not enough to create a really portable solution. You also have to restructure your program so that it exposes enough parallelism. See "Structured Parallel Programming" or similar books for examples of possible approaches.
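As one small illustration of exposing more parallelism, a nested loop can be collapsed into a single iteration space so that all rows*cols iterations, rather than just the outer rows, are available to the device. The sketch below is an assumption-laden example, not taken from the book: `madd` is a hypothetical routine, and it relies on OpenMP 4.x target support (without it, the pragma is ignored and the loops run serially).

```c
/* Hypothetical example: element-wise sum of two row-major matrices.
   collapse(2) merges both loops into one parallel iteration space. */
void madd(int rows, int cols, const float *a, const float *b, float *c)
{
    int n = rows * cols;   /* total number of elements to map */

    #pragma omp target teams distribute parallel for collapse(2) \
        map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            c[i * cols + j] = a[i * cols + j] + b[i * cols + j];
}
```

Collapsing only works when the loops are perfectly nested and their iterations are independent, which is often the kind of restructuring the original code needs first.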


To add to what was said about support on other platforms above: IBM is contributing to two OpenMP 4.5 compilers. One is the open-source Clang/LLVM compiler; the other is IBM's XL compiler. Both compilers share the same helper OpenMP offloading library, but differ in the compiler's code generation and optimization for the GPU. For Fortran, the XL Fortran compiler supports a large subset of OpenMP 4.5 offloading to NVIDIA GPUs, starting in version 15.1.5 (and version 13.1.5 for XL C/C++). More features are being added this year and next year, with the aim of complete support in 2018. If you're on POWER, you can join the XL compiler beta program to get access to our latest OpenMP offloading features in Fortran and C/C++.
