GPU621/Intel Parallel Studio VTune Amplifier

18 bytes added, 00:13, 9 December 2021
Performance
====Performance====
As can be seen from the screenshot below, in the OpenMP solution the work is spread unevenly across the 8 threads. This can be explained by the fact that the first thread is responsible for initializing the arrays and for executing the single construct. There is also a lot of idle time caused by the barrier construct. The prefix scan itself, however, seems to be spread almost evenly and falls within average to optimal CPU utilization.
[[File:OMP_Scan.png]]
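To make the sources of imbalance concrete, here is a minimal sketch of a tiled prefix scan in OpenMP. It is not the code that was profiled above; the function name <code>inclusive_scan_omp</code> and the three-phase layout are assumptions chosen only to illustrate where a single construct and a barrier leave threads waiting.

```cpp
#include <algorithm>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not from the profiled solution): inclusive prefix sum
// of `in`, with one contiguous tile per thread.
std::vector<long long> inclusive_scan_omp(const std::vector<int>& in) {
    const long long n = static_cast<long long>(in.size());
    std::vector<long long> out(n);
    std::vector<long long> tile_total;  // tile_total[t + 1] holds the sum of tile t

    #pragma omp parallel
    {
        int nthreads = 1, tid = 0;  // serial fallback when built without OpenMP
#ifdef _OPENMP
        nthreads = omp_get_num_threads();
        tid = omp_get_thread_num();
#endif
        // Serial setup: one thread allocates while the others wait at the
        // implicit barrier that ends the single construct.
        #pragma omp single
        tile_total.assign(nthreads + 1, 0);

        // Each thread owns the half-open range [begin, end).
        const long long chunk = (n + nthreads - 1) / nthreads;
        const long long begin = std::min(n, tid * chunk);
        const long long end = std::min(n, begin + chunk);

        // Phase 1: every thread scans its own tile (evenly spread work).
        long long sum = 0;
        for (long long i = begin; i < end; ++i) {
            sum += in[i];
            out[i] = sum;
        }
        tile_total[tid + 1] = sum;

        // All tile sums must be written before they are combined.
        #pragma omp barrier

        // Phase 2: one thread turns tile sums into tile offsets; the rest idle.
        #pragma omp single
        for (int t = 1; t <= nthreads; ++t) tile_total[t] += tile_total[t - 1];

        // Phase 3: every thread adds the sum of all preceding tiles.
        const long long offset = tile_total[tid];
        for (long long i = begin; i < end; ++i) out[i] += offset;
    }
    return out;
}
```

In this sketch, phases 1 and 3 keep all threads busy, while the two single constructs and the explicit barrier are the points where a thread analysis would be expected to show waiting time. Built without OpenMP support, the pragmas are ignored and the function runs as an ordinary serial scan.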