GPU621/Intel Parallel Studio VTune Amplifier

86 bytes added, 19:58, 8 December 2021
Performance
</source>
====Performance====
As expected, the CPU utilization of the serial version is rated poor, because only one thread is used for data utilization and the Prefix Scan algorithm. The Hotspots report shows that the main function took 2.297 seconds when built with the Intel compiler with no optimization. Interestingly, deallocation also consumed a noticeable amount of CPU time, at 0.6833 seconds.
[[File:Serial_Scan.png]]
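
To make the measured phases concrete, here is a minimal sketch of a single-threaded inclusive prefix scan together with the allocation and deallocation steps that a profiler would attribute time to. It is not the code that was profiled above: the function names <code>serial_inclusive_scan</code> and <code>run_serial_scan</code>, the use of raw <code>new[]</code>/<code>delete[]</code> arrays, and the all-ones input are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: inclusive prefix sum computed by a single thread.
// After the call, out[i] == in[0] + in[1] + ... + in[i].
void serial_inclusive_scan(const int* in, int* out, std::size_t n) {
    if (n == 0) return;
    out[0] = in[0];
    for (std::size_t i = 1; i < n; ++i) {
        // Each element depends on the previous result, so this loop
        // runs on one thread from start to finish.
        out[i] = out[i - 1] + in[i];
    }
}

// Hypothetical driver that mirrors the phases a Hotspots report would show:
// allocation, initialization, scan, and deallocation.
// Returns the last element of the scan so the result can be checked.
int run_serial_scan(std::size_t n) {
    assert(n > 0);

    // Allocation (assumed to use raw arrays).
    int* in = new int[n];
    int* out = new int[n];

    // Initialization: fill the input with ones, so out[i] should equal i + 1.
    for (std::size_t i = 0; i < n; ++i) {
        in[i] = 1;
    }

    // Scan.
    serial_inclusive_scan(in, out, n);
    int last = out[n - 1];

    // Deallocation: for large arrays this step can also show up
    // as a measurable cost in a profile.
    delete[] in;
    delete[] out;

    return last;
}
```

Calling <code>run_serial_scan(n)</code> from <code>main</code> with a large <code>n</code> and profiling the resulting binary should produce a report in which every phase sits on one thread, which is the pattern described above. The actual timings depend on the input size, compiler, and machine.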