

GPU621/Intel Parallel Studio VTune Amplifier

354 bytes added, 23:35, 8 December 2021
As the screenshot below shows, the OpenMP solution spreads the work unevenly across its 8 threads. This can be explained by the fact that the first thread is responsible for initializing the arrays and for executing the single construct. There is also a lot of idle time, which is caused by threads waiting at the barrier construct. The prefix scan itself, however, appears to be spread almost evenly.
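To make the source of the imbalance easier to see, here is a minimal sketch of a tiled OpenMP prefix scan. It is not the code that was profiled: the function name `prefix_scan`, the `tile_sum` array and the tiling scheme are assumptions made for illustration. It only shows where a `single` construct and a `barrier` construct typically appear in such a solution, and why the other threads sit idle while one thread works.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch still compiles without OpenMP enabled.
static int omp_get_thread_num() { return 0; }
static int omp_get_num_threads() { return 1; }
#endif

// Hypothetical helper: inclusive prefix sum of `in` using a three-step tiled scan.
std::vector<long> prefix_scan(const std::vector<long>& in) {
    const std::size_t n = in.size();
    std::vector<long> out(n);
    std::vector<long> tile_sum;  // tile_sum[t + 1] holds the total of tile t

    #pragma omp parallel
    {
        const std::size_t nt = static_cast<std::size_t>(omp_get_num_threads());
        const std::size_t tid = static_cast<std::size_t>(omp_get_thread_num());

        // One thread allocates the shared array; the rest wait at the
        // implicit barrier that ends the single construct.
        #pragma omp single
        tile_sum.assign(nt + 1, 0);

        const std::size_t begin = n * tid / nt;
        const std::size_t end = n * (tid + 1) / nt;

        // Step 1: every thread scans its own tile (evenly spread work).
        long running = 0;
        for (std::size_t i = begin; i < end; ++i) {
            running += in[i];
            out[i] = running;
        }
        tile_sum[tid + 1] = running;

        // All tile totals must be written before they are combined.
        #pragma omp barrier

        // Step 2: one thread turns the tile totals into offsets (serial part).
        #pragma omp single
        for (std::size_t t = 1; t <= nt; ++t) tile_sum[t] += tile_sum[t - 1];

        // Step 3: every thread adds the total of all preceding tiles.
        const long offset = tile_sum[tid];
        for (std::size_t i = begin; i < end; ++i) out[i] += offset;
    }
    return out;
}
```

In this sketch, steps 1 and 3 divide the elements evenly among the threads, while the `single` regions and the explicit `barrier` are the points where threads wait. That pattern is consistent with the timeline described above.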
