GPU621/GPU Targeters

== Progress ==

== Difference of CPU and GPU for parallel applications (Yunseon) ==
With CUDA version 9.2 and multiple P100 server GPUs, you can realize up to 50x performance improvements over CPUs.
 
 
'''OpenMP (Open MultiProcessing)'''

OpenMP is a parallel programming model based on compiler directives that allows application developers to incrementally add parallelism to their application code.

The OpenMP API specification provides an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran on most platforms. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.

Use OpenCL when you have existing code in that language or when you need portability to multiple platforms and devices. It runs on Windows, Linux, and Mac OS, as well as a wide variety of hardware platforms (described above).

'''Benefits of OpenMP. Why choose it over a GPU kernel model?'''
* supports multi-core, vectorization and GPU
* allows for "teams of threads"
* portable between various platforms
* heterogeneous memory allocation and custom data mappers

More information (comparing OpenMP syntax with CUDA, HIP and others): [https://stackoverflow.com/questions/7263193/opencl-vs-openmp-performance#7263823 OpenCL vs OpenMP performance (Stack Overflow)], [https://github.com/ROCm-Developer-Tools/aomp/blob/master/docs/openmp_terms.md OpenMP terms (AOMP docs)]
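As a hedged illustration of the incremental, directive-based approach (the file name, loop, and array size below are arbitrary choices, not from the original page), a single pragma is enough to parallelize an ordinary C loop across CPU threads:

<pre>
// File saxpy.c -- compile with: clang -fopenmp saxpy.c -o saxpy
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static float x[N], y[N];
    float a = 2.0f;

    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    // The only change needed to parallelize the loop across CPU threads
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];

    printf("y[0] = %f\n", y[0]);
    return 0;
}
</pre>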
== Programming GPUs with OpenMP ==
=== Target Region ===
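A target region marks a block of code, together with the data it needs, for execution on a device such as a GPU. The sketch below (file name, array size, and map clauses are illustrative assumptions, not taken from the original page) extends the loop example above so that it is offloaded to the default device:

<pre>
// File saxpy_target.c -- offloads the loop to the default target device
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static float x[N], y[N];
    float a = 2.0f;

    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    // Everything inside the target region executes on the device;
    // map clauses describe how data moves between host and device memory.
    #pragma omp target teams distribute parallel for map(to: x[0:N]) map(tofrom: y[0:N])
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];

    printf("y[0] = %f\n", y[0]);
    return 0;
}
</pre>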
== Instructions for NVIDIA ==
'''How to set up the compiler and target offloading for Linux with a target NVIDIA GPU'''

Place the Clang front end and OpenMP runtime sources inside the LLVM 7.0.0 source tree before building, and make sure the current user is a member of the 'video' group:

<pre>
$ mv cfe-7.0.0.src llvm-7.0.0.src/tools/clang
$ mv openmp-7.0.0.src llvm-7.0.0.src/projects/openmp
$ sudo usermod -a -G video $USER
</pre>

Then compile with OpenMP offloading for NVIDIA's nvptx64 target:

<pre>
$ clang -fopenmp -fopenmp-targets=nvptx64 -O2 foo.c
</pre>
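Before offloading real code, it can help to confirm that the compiler and runtime actually see a target device. The following is a small sketch (the file name is arbitrary and not from the original page) using the standard OpenMP runtime routine omp_get_num_devices():

<pre>
// File devices.c -- reports how many offload devices the OpenMP runtime can see
#include <omp.h>
#include <stdio.h>

int main(void)
{
    // Returns 0 if no usable target device (or no offload support) is found
    printf("Number of available devices: %d\n", omp_get_num_devices());
    return 0;
}
</pre>

<pre>
$ clang -fopenmp -fopenmp-targets=nvptx64 devices.c -o devices
$ ./devices
</pre>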
 
== Instructions for AMD ==
'''How to set up the compiler and target offloading for Linux with a target AMD GPU''' (Elena)

Note: the user should be a member of the 'video' group; if this does not help, you may also add the user to the 'render' group, for example as shown below.
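For example (assuming your username is in $USER and that you log out and back in for the group change to take effect):

<pre>
$ sudo usermod -a -G video $USER
$ sudo usermod -a -G render $USER
</pre>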
[https://github.com/ROCm-Developer-Tools/aomp AOMP] is an open-source Clang/LLVM-based compiler with added support for the OpenMP® API on Radeon™ GPUs.
'''Hello world compilation example:'''
 
<pre>
// File helloWorld.c -- each OpenMP thread on the host CPU prints a greeting
#include <omp.h>
#include <stdio.h>

int main()
{
    #pragma omp parallel
    {
        printf("Hello world!\n");
    }
}
</pre>
 
Make sure to add your new AOMP installation to your PATH:
<pre>
export AOMP="/usr/lib/aomp"
export PATH=$AOMP/bin:$PATH
 
clang -fopenmp helloWorld.c -o helloWorld
 
./helloWorld
</pre>
'''Hello world on GPU example'''
<pre>
// File helloWorld.c -- the parallel region below is offloaded to the GPU
#include <omp.h>
#include <stdio.h>

int main(void)
{
    #pragma omp target
    #pragma omp parallel
    printf("Hello world from GPU! THREAD %d\n", omp_get_thread_num());
}
</pre>
 
<pre>
export AOMP="/usr/lib/aomp"
export PATH=$AOMP/bin:$PATH
export LIBOMPTARGET_KERNEL_TRACE=1
 
clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 helloWorld.c -o helloWorld
 
./helloWorld
</pre>
 
 
 
To see the name of your device (used for -march=gfx803 above), you may run the 'rocminfo' tool:
 
<pre>
$ /opt/rocm/bin/rocminfo
</pre>
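To confirm at run time that a target region really executed on the GPU instead of silently falling back to the host, you can call the standard OpenMP routine omp_is_initial_device() inside the region. A minimal sketch (the file name is arbitrary, not from the original page):

<pre>
// File whereami.c -- reports whether the target region ran on the host or the device
#include <omp.h>
#include <stdio.h>

int main(void)
{
    int on_host = 1;

    // on_host is mapped back so the host can inspect where the region ran
    #pragma omp target map(tofrom: on_host)
    {
        on_host = omp_is_initial_device();
    }

    printf("Target region ran on the %s\n", on_host ? "host (fallback)" : "GPU");
    return 0;
}
</pre>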
If you run into further problems compiling and running, try starting with the examples at https://github.com/ROCm-Developer-Tools/aomp/tree/master/examples/openmp

== Results and Graphs (Nathan/Elena) ==

== Conclusions (Nathan/Elena/Yunseon) ==
== Sources ==