Difference between revisions of "Group 6"
Revision as of 23:08, 16 March 2019
Contents
Group 6
Team Members
Progress
Assignment 1 - Select and Assess
Array Processing
Subject: Array Processing
Blaise Barney's Introduction to Parallel Computing tutorial (https://computing.llnl.gov/tutorials/parallel_comp/) presents array processing as one of its parallel examples, which "demonstrates calculations on 2-dimensional array elements; a function is evaluated on each array element."
A standard random-number routine is used to initialize a 2-dimensional array. The program then performs a calculation on 2-dimensional arrays, which in this example is a matrix-matrix multiplication.
In the following profile, n = 1000.
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
100.11      1.48     1.48                             multiply(float**, float**, float**, int)
  0.68      1.49     0.01                             init(float**, int)
  0.00      1.49     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z4initPPfi

Call graph

granularity: each sample hit covers 2 byte(s) for 0.67% of 1.49 seconds

index % time    self  children    called     name
                                                 <spontaneous>
[1]     99.3    1.48    0.00                 multiply(float**, float**, float**, int) [1]
-----------------------------------------------
                                                 <spontaneous>
[2]      0.7    0.01    0.00                 init(float**, int) [2]
-----------------------------------------------
                0.00    0.00       1/1           __libc_csu_init [16]
[10]     0.0    0.00    0.00       1         _GLOBAL__sub_I__Z4initPPfi [10]
-----------------------------------------------

Index by function name

  [10] _GLOBAL__sub_I__Z4initPPfi (arrayProcessing.cpp)   [2] init(float**, int)   [1] multiply(float**, float**, float**, int)

According to the call graph, multiply() accounts for more than 99% of the runtime because it contains three nested for-loops, which makes it O(n^3). init() is the second-busiest function, at O(n^2), but it takes less than 1% of the runtime.
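The profile lists only the signatures multiply(float**, float**, float**, int) and init(float**, int); the function bodies are not shown on this page. As a rough sketch of why the cost splits this way, assuming square n x n matrices stored as arrays of row pointers and assuming rand() as the random source, the two functions might look like the following. The helpers alloc_matrix and free_matrix are hypothetical names added only for illustration.

```cpp
#include <cstdlib>

// Hypothetical helper: allocate an n x n matrix as an array of row pointers,
// with every element zero-initialized.
float** alloc_matrix(int n) {
    float** m = new float*[n];
    for (int i = 0; i < n; ++i) m[i] = new float[n]();
    return m;
}

// Hypothetical helper: release a matrix created by alloc_matrix.
void free_matrix(float** m, int n) {
    for (int i = 0; i < n; ++i) delete[] m[i];
    delete[] m;
}

// Assumed body: fill the matrix with pseudo-random values in [0, 1].
// Two nested loops, so O(n^2) work.
void init(float** a, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            a[i][j] = static_cast<float>(std::rand()) / RAND_MAX;
}

// Assumed body: c = a * b. Three nested loops, so O(n^3) work.
void multiply(float** a, float** b, float** c, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}
```

For n = 1000 the innermost statement of multiply runs 10^9 times, while init touches only 10^6 elements, which is consistent with multiply dominating the profile.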
Because each element is calculated independently of the others, the problem is embarrassingly parallel. The array elements can be evenly distributed so that each process owns a portion of the array (a subarray). The problem can then be solved in less time with multiple compute resources than with a single compute resource.
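The even distribution described above can be sketched as a block decomposition by rows. This is only an illustration under assumptions: the names row_range and multiply_rows are hypothetical, the matrices are assumed to be square n x n arrays of row pointers, and the sketch leaves out the actual launching of processes or threads. It shows only which rows each worker would own.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical helper: half-open row range [begin, end) owned by one worker
// when n rows are split as evenly as possible among num_workers workers.
std::pair<int, int> row_range(int worker, int num_workers, int n) {
    int base = n / num_workers;    // rows every worker gets
    int extra = n % num_workers;   // the first `extra` workers get one more
    int begin = worker * base + std::min(worker, extra);
    int end = begin + base + (worker < extra ? 1 : 0);
    return {begin, end};
}

// Compute only rows [row_begin, row_end) of c = a * b.
// Calls with disjoint row ranges write disjoint parts of c,
// so they do not depend on one another.
void multiply_rows(float** a, float** b, float** c, int n,
                   int row_begin, int row_end) {
    for (int i = row_begin; i < row_end; ++i)
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}
```

Each worker w would call multiply_rows with the range returned by row_range(w, num_workers, n). Since no two workers write the same row of c, the calls could run concurrently without synchronization on the output.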
The Monte Carlo Simulation (PI Calculation)
Subject: The Monte Carlo Simulation (PI Calculation)
The code comes from https://rosettacode.org/wiki/Monte_Carlo_methods#C.2B.2B
A Monte Carlo simulation is a way of approximating the value of a function when calculating the exact value is difficult or impossible.
It uses random sampling to define constraints on the value and then makes a sort of "best guess."
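For the PI calculation, the idea can be sketched as follows: sample points uniformly in the unit square, count how many land inside the quarter circle of radius 1, and multiply the resulting fraction by 4. This is a minimal illustration and not the Rosetta Code listing; the function name estimate_pi, its parameters, and the fixed seed are assumptions made for the example.

```cpp
#include <cstdint>
#include <random>

// Estimate pi by sampling points uniformly in the unit square and counting
// how many fall inside the quarter circle x^2 + y^2 <= 1.
// The function name, parameters, and default seed are illustrative choices.
double estimate_pi(std::uint64_t samples, std::uint32_t seed = 12345) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::uint64_t inside = 0;
    for (std::uint64_t i = 0; i < samples; ++i) {
        double x = dist(gen);
        double y = dist(gen);
        if (x * x + y * y <= 1.0) ++inside;
    }
    // Area of the quarter circle divided by area of the square is pi / 4.
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

The estimate tightens slowly as the sample count grows, since the error shrinks roughly with the square root of the number of samples. Each sample is independent of the others, so the loop could be split across workers in the same way as the array example.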
Zhijian
Subject: