Open main menu

CDOT Wiki β

Changes

The parallelizing Express

10,760 bytes added, 17:47, 21 February 2017
Assignment 1
----
 
=== Sorted array processing by Abbas Zoeb ===
 
Sorted array processing finds the processing time it takes for an array before and after it has been sorted. This code basically allocates an array with random numbers (array size is 32768). Then the array is sorted using the standard library inbuilt sorting algorithm (which most likely is using Quick Sort). Then another loop is run through the entire array to find all the elements that are above the number '127'. This loop is inside another loop that rotates for another 100,000 times. The loop would go by faster as the array is already sorted and would be much easier now to find elements greater than 127. Profiling was performed on this c++ code on matrix. The project can be found here: [https://gist.github.com/raul-hc/a427b2f602a2c5cd46f4da0d803542ca]
 
==== Running the Program ====
 
The file can be simply compiled using g++ or clang++ on matrix.
 
Steps to compile the code and do the profiling on it:
 
0)Copy the code from GitHub and paste it into a cpp file (I named it sorted.cpp)
 
1) g++ -Wall -pg sorted.cpp -o sorted
 
2) ./sorted
 
3) gprof sorted gmon.out > prof_output
 
4) cat prof_output
 
Now you can see the flat and graph profiling for the code.
 
 
==== The Code ====
 
#include <algorithm>
#include <ctime>
#include <iostream>
 
int main()
{
// Generate data
const unsigned arraySize = 32768;
int data[arraySize];
 
for (unsigned c = 0; c < arraySize; ++c)
data[c] = std::rand() % 256;
 
// The next loop runs faster because we are using a sorted array
std::sort(data, data + arraySize);
 
// Test
clock_t start = clock();
long long sum = 0;
 
for (unsigned i = 0; i < 100000; ++i)
{
// Primary loop
for (unsigned c = 0; c < arraySize; ++c)
{
if (data[c] >= 128)
sum += data[c];
}
}
 
double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC;
 
std::cout << elapsedTime << std::endl;
std::cout << "sum = " << sum << std::endl;
}
 
==== Analysis ====
 
Here we can see the profiling as below:
 
Flat profile:
 
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
99.93 14.89 14.89 main
0.07 14.90 0.01 120348 0.00 0.00 void std::__iter_swap<true>::iter_swap<int*, int*>(int*, int*)
0.00 14.90 0.00 120348 0.00 0.00 void std::swap<int>(int&, int&)
0.00 14.90 0.00 120348 0.00 0.00 void std::iter_swap<int*, int*>(int*, int*)
0.00 14.90 0.00 32767 0.00 0.00 void std::__unguarded_linear_insert<int*>(int*)
0.00 14.90 0.00 2963 0.00 0.00 void std::__move_median_first<int*>(int*, int*, int*)
0.00 14.90 0.00 2963 0.00 0.00 int* std::__unguarded_partition<int*, int>(int*, int*, int const&)
0.00 14.90 0.00 2963 0.00 0.00 int* std::__unguarded_partition_pivot<int*>(int*, int*)
0.00 14.90 0.00 1 0.00 0.00 _GLOBAL__sub_I_main
0.00 14.90 0.00 1 0.00 0.00 __static_initialization_and_destruction_0(int, int)
0.00 14.90 0.00 1 0.00 0.00 void std::__insertion_sort<int*>(int*, int*)
0.00 14.90 0.00 1 0.00 10.00 void std::__introsort_loop<int*, int>(int*, int*, int)
0.00 14.90 0.00 1 0.00 0.00 void std::__final_insertion_sort<int*>(int*, int*)
0.00 14.90 0.00 1 0.00 0.00 void std::__unguarded_insertion_sort<int*>(int*, int*)
0.00 14.90 0.00 1 0.00 0.00 std::__lg(int)
0.00 14.90 0.00 1 0.00 10.00 void std::sort<int*>(int*, int*)
Call graph
 
 
granularity: each sample hit covers 4 byte(s) for 0.07% of 14.90 seconds
 
index % time self children called name
<spontaneous>
[1] 100.0 14.89 0.01 main [1]
0.00 0.01 1/1 void std::sort<int*>(int*, int*) [6]
-----------------------------------------------
0.01 0.00 120348/120348 void std::iter_swap<int*, int*>(int*, int*) [3]
[2] 0.1 0.01 0.00 120348 void std::__iter_swap<true>::iter_swap<int*, int*>(int*, int*) [2]
0.00 0.00 120348/120348 void std::swap<int>(int&, int&) [12]
-----------------------------------------------
0.00 0.00 2746/120348 void std::__move_median_first<int*>(int*, int*, int*) [8]
0.00 0.01 117602/120348 int* std::__unguarded_partition<int*, int>(int*, int*, int const&) [7]
[3] 0.1 0.00 0.01 120348 void std::iter_swap<int*, int*>(int*, int*) [3]
0.01 0.00 120348/120348 void std::__iter_swap<true>::iter_swap<int*, int*>(int*, int*) [2]
-----------------------------------------------
0.00 0.01 2963/2963 void std::__introsort_loop<int*, int>(int*, int*, int) [5]
[4] 0.1 0.00 0.01 2963 int* std::__unguarded_partition_pivot<int*>(int*, int*) [4]
0.00 0.01 2963/2963 int* std::__unguarded_partition<int*, int>(int*, int*, int const&) [7]
0.00 0.00 2963/2963 void std::__move_median_first<int*>(int*, int*, int*) [8]
-----------------------------------------------
2963 void std::__introsort_loop<int*, int>(int*, int*, int) [5]
0.00 0.01 1/1 void std::sort<int*>(int*, int*) [6]
[5] 0.1 0.00 0.01 1+2963 void std::__introsort_loop<int*, int>(int*, int*, int) [5]
0.00 0.01 2963/2963 int* std::__unguarded_partition_pivot<int*>(int*, int*) [4]
2963 void std::__introsort_loop<int*, int>(int*, int*, int) [5]
-----------------------------------------------
0.00 0.01 1/1 main [1]
[6] 0.1 0.00 0.01 1 void std::sort<int*>(int*, int*) [6]
0.00 0.01 1/1 void std::__introsort_loop<int*, int>(int*, int*, int) [5]
0.00 0.00 1/1 std::__lg(int) [19]
0.00 0.00 1/1 void std::__final_insertion_sort<int*>(int*, int*) [17]
-----------------------------------------------
0.00 0.01 2963/2963 int* std::__unguarded_partition_pivot<int*>(int*, int*) [4]
[7] 0.1 0.00 0.01 2963 int* std::__unguarded_partition<int*, int>(int*, int*, int const&) [7]
0.00 0.01 117602/120348 void std::iter_swap<int*, int*>(int*, int*) [3]
-----------------------------------------------
0.00 0.00 2963/2963 int* std::__unguarded_partition_pivot<int*>(int*, int*) [4]
[8] 0.0 0.00 0.00 2963 void std::__move_median_first<int*>(int*, int*, int*) [8]
0.00 0.00 2746/120348 void std::iter_swap<int*, int*>(int*, int*) [3]
-----------------------------------------------
0.00 0.00 120348/120348 void std::__iter_swap<true>::iter_swap<int*, int*>(int*, int*) [2]
[12] 0.0 0.00 0.00 120348 void std::swap<int>(int&, int&) [12]
-----------------------------------------------
0.00 0.00 15/32767 void std::__insertion_sort<int*>(int*, int*) [16]
0.00 0.00 32752/32767 void std::__unguarded_insertion_sort<int*>(int*, int*) [18]
[13] 0.0 0.00 0.00 32767 void std::__unguarded_linear_insert<int*>(int*) [13]
-----------------------------------------------
0.00 0.00 1/1 __do_global_ctors_aux [34]
[14] 0.0 0.00 0.00 1 _GLOBAL__sub_I_main [14]
0.00 0.00 1/1 __static_initialization_and_destruction_0(int, int) [15]
-----------------------------------------------
0.00 0.00 1/1 _GLOBAL__sub_I_main [14]
[15] 0.0 0.00 0.00 1 __static_initialization_and_destruction_0(int, int) [15]
-----------------------------------------------
0.00 0.00 1/1 void std::__final_insertion_sort<int*>(int*, int*) [17]
[16] 0.0 0.00 0.00 1 void std::__insertion_sort<int*>(int*, int*) [16]
0.00 0.00 15/32767 void std::__unguarded_linear_insert<int*>(int*) [13]
-----------------------------------------------
0.00 0.00 1/1 void std::sort<int*>(int*, int*) [6]
[17] 0.0 0.00 0.00 1 void std::__final_insertion_sort<int*>(int*, int*) [17]
0.00 0.00 1/1 void std::__insertion_sort<int*>(int*, int*) [16]
0.00 0.00 1/1 void std::__unguarded_insertion_sort<int*>(int*, int*) [18]
-----------------------------------------------
0.00 0.00 1/1 void std::__final_insertion_sort<int*>(int*, int*) [17]
[18] 0.0 0.00 0.00 1 void std::__unguarded_insertion_sort<int*>(int*, int*) [18]
0.00 0.00 32752/32767 void std::__unguarded_linear_insert<int*>(int*) [13]
-----------------------------------------------
0.00 0.00 1/1 void std::sort<int*>(int*, int*) [6]
[19] 0.0 0.00 0.00 1 std::__lg(int) [19]
-----------------------------------------------
Index by function name
 
[14] _GLOBAL__sub_I_main [7] int* std::__unguarded_partition<int*, int>(int*, int*, int const&) [6] void std::sort<int*>(int*, int*)
[15] __static_initialization_and_destruction_0(int, int) [17] void std::__final_insertion_sort<int*>(int*, int*) [12] void std::swap<int>(int&, int&)
[2] void std::__iter_swap<true>::iter_swap<int*, int*>(int*, int*) [13] void std::__unguarded_linear_insert<int*>(int*) [3] void std::iter_swap<int*, int*>(int*, int*)
[16] void std::__insertion_sort<int*>(int*, int*) [18] void std::__unguarded_insertion_sort<int*>(int*, int*) [1] main
[5] void std::__introsort_loop<int*, int>(int*, int*, int) [4] int* std::__unguarded_partition_pivot<int*>(int*, int*)
[8] void std::__move_median_first<int*>(int*, int*, int*) [19] std::__lg(int)
 
 
***The compilation of the code takes roughly 14 seconds and when the code is profiled we can see that the main uses most of the % processing time. i.e., 99.93%.
16
edits