

Group 6

32 bytes added, 23:10, 16 March 2019
Array Processing
In the following profile example, n = 1000.
<pre>
Flat profile:
Each sample counts as 0.01 seconds.
0.68 1.49 0.01 init(float**, int)
0.00 1.49 0.00 1 0.00 0.00 _GLOBAL__sub_I__Z4initPPfi
</pre>
<pre>
Call graph
Index by function name
[10] _GLOBAL__sub_I__Z4initPPfi (arrayProcessing.cpp)
 [2] init(float**, int)
 [1] multiply(float**, float**, float**, int)
</pre>From the call graph, multiply() accounts for more than 99% of the runtime because it contains three nested for-loops, so its running time T(n) is O(n^3). init() is the second most time-consuming function, with a running time of O(n^2).
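To make the two complexity figures concrete, here is a minimal sketch of what functions with the profiled signatures init(float**, int) and multiply(float**, float**, float**, int) might look like. The function bodies, the fill values in init(), and the allocate() and release() helpers are assumptions for illustration; the original arrayProcessing.cpp is not shown on this page and may differ.

```cpp
#include <cassert>

// Hypothetical helper (not from the original source): allocate an
// n x n matrix as an array of row pointers, zero-initialised.
float** allocate(int n) {
    assert(n > 0);
    float** m = new float*[n];
    for (int i = 0; i < n; ++i) {
        m[i] = new float[n]();
    }
    return m;
}

// Hypothetical helper: free a matrix created by allocate().
void release(float** m, int n) {
    for (int i = 0; i < n; ++i) {
        delete[] m[i];
    }
    delete[] m;
}

// Two nested loops touch every element once: O(n^2).
// The fill value i + j is an assumption chosen for illustration.
void init(float** m, int n) {
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            m[i][j] = static_cast<float>(i + j);
        }
    }
}

// Three nested loops: n^2 output elements, each needing an n-term
// dot product, so O(n^3) overall.
void multiply(float** a, float** b, float** c, int n) {
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k) {
                sum += a[i][k] * b[k][j];
            }
            c[i][j] = sum;
        }
    }
}
```

For n = 1000 the inner statement of multiply() runs 10^9 times, while the assignment in init() runs only 10^6 times per call, which is consistent with multiply() dominating the profile.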
Because each element of the result is calculated independently of the others, the problem is embarrassingly parallel. The array elements are distributed evenly so that each process owns a portion of the array (a subarray). With multiple compute resources, the problem can therefore be solved in less time than with a single compute resource.
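The even distribution described above can be sketched as a row-block decomposition. This example shows only how the rows might be split and that each block can be computed without reference to the others; it does not launch processes or threads. The names RowRange, rowRange() and multiplyRows() are hypothetical and are not taken from the original code.

```cpp
#include <algorithm>
#include <cassert>

// Half-open range of rows [begin, end) owned by one worker.
struct RowRange {
    int begin;
    int end;
};

// Split n rows among `size` workers as evenly as possible.
// The first (n % size) workers each receive one extra row.
RowRange rowRange(int rank, int size, int n) {
    assert(size > 0 && rank >= 0 && rank < size);
    int base = n / size;
    int extra = n % size;
    int begin = rank * base + std::min(rank, extra);
    int end = begin + base + (rank < extra ? 1 : 0);
    return {begin, end};
}

// Compute only the rows of c = a * b that fall inside `rows`.
// Each call reads a and b but writes a disjoint set of rows in c,
// so calls for different ranges do not depend on one another.
void multiplyRows(float** a, float** b, float** c, int n, RowRange rows) {
    for (int i = rows.begin; i < rows.end; ++i) {
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k) {
                sum += a[i][k] * b[k][j];
            }
            c[i][j] = sum;
        }
    }
}
```

In a real parallel version, each process (or thread) would call multiplyRows() with the range returned by rowRange() for its own rank. With n = 1000 and 4 workers, each worker would own 250 rows.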
