GPU621/To Be Announced

1,030 bytes added, 23:25, 29 November 2020
Programming GPUs with OpenMP
<pre>
// Offloading to the target device, but still without parallelism:
// everything inside the target region runs serially on the device.
#pragma omp target map(to: A, B) map(tofrom: sum)
{
    // (illustrative serial loop)
    for (int i = 0; i < N; i++)
        sum += A[i] + B[i];
}
</pre>
 
<h3>Dynamically allocated data</h3>
If we have dynamically allocated data in the host region that we'd like to map to the target region, then in the map clause we need to specify the number of elements to copy over. Otherwise all the compiler would have is a pointer to some region in memory; it has no way of knowing the size of the allocation that needs to be mapped over to the target device.
 
<pre>
int* a = (int*)malloc(sizeof(int) * N);
#pragma omp target map(to: a[0:N]) // [start:length]
</pre>
 
<h3>Target data regions</h3>
 
 
<h3>Teams construct</h3>
 
<h3>Declare Target</h3>
''Calling functions within the scope of a target region.''
* The ''declare target'' construct will compile a version of a function that can be called on the device.
* In order to call a function inside a target region, the function must first be declared with the ''declare target'' construct so that a device version is compiled.
<pre>
#pragma omp declare target
int combine(int a, int b);
#pragma omp end declare target
 
#pragma omp target teams distribute parallel for \
map(to: A, B), map(tofrom:sum), reduction(+:sum)
for (int i = 0; i < N; i++) {
sum += combine(A[i], B[i]);
}
</pre>