CDOT Wiki β

GPU610/TeamEh

2,705 bytes added, 14:06, 3 October 2014
Progress
The functions that perform the filtering are <code>Gauss_filter::smooth_ord</code>, <code>unsharp</code> and <code>Use_kernel::new_im()</code>. These functions are all O(r × c) in the image dimensions (r rows by c columns), so they are where the biggest gains from parallelization will be found.
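
To illustrate why a per-pixel filter costs O(r × c) and why it is a natural fit for parallelization, here is a minimal sketch. It is not the project's code: the <code>Image</code> struct and <code>apply_kernel3x3</code> function are hypothetical names, and a fixed 3x3 kernel with borders left untouched is an assumption made to keep the example short.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major greyscale image; not the project's actual image type.
struct Image {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> px;  // rows * cols values
};

// Apply a 3x3 kernel (row-major, 9 weights) to every interior pixel.
// Border pixels are copied unchanged to keep the sketch short.
Image apply_kernel3x3(const Image& in, const std::array<float, 9>& kernel) {
    assert(in.px.size() == in.rows * in.cols);
    Image out = in;
    for (std::size_t r = 1; r + 1 < in.rows; ++r) {
        for (std::size_t c = 1; c + 1 < in.cols; ++c) {
            // Constant work per pixel (9 multiply-adds), so the total is O(r x c).
            float sum = 0.0f;
            for (std::size_t dr = 0; dr < 3; ++dr) {
                for (std::size_t dc = 0; dc < 3; ++dc) {
                    sum += kernel[dr * 3 + dc] *
                           in.px[(r + dr - 1) * in.cols + (c + dc - 1)];
                }
            }
            // Each output pixel reads only the input image, never other outputs.
            out.px[r * in.cols + c] = sum;
        }
    }
    return out;
}
```

Because every output pixel depends only on the input image, the two outer loops could in principle be mapped to one GPU thread per pixel.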
 
==== Bradly Hoover's Results ====
 
===== Introduction =====
 
The SHA-1 implementation used in this project is a C++ version written by Paul E. Jones[1]. The recursive permutation algorithm was taken from a user submission[2] on Code Review Stack Exchange.
 
[1] http://www.packetizer.com/security/sha1/
 
[2] http://codereview.stackexchange.com/questions/38474/brute-force-algorithm-in-c
 
After some tweaking to make the two work together, I ran the program with the upper- and lowercase alphabet as the character set for the permutation and a length of 5. The length and character set are hard-coded.
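
The brute-force search can be sketched as a recursive enumeration of candidate strings. This is an illustration under stated assumptions, not the code from [2]: the names <code>enumerate</code>, <code>count_by_enumeration</code> and <code>count_closed_form</code> are hypothetical, and the <code>visit</code> callback stands in for hashing a candidate with SHA-1 and comparing it against the target digest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Visit every string of exactly `length` characters drawn from `alphabet`.
void enumerate(const std::string& alphabet, std::size_t length,
               std::string& prefix,
               const std::function<void(const std::string&)>& visit) {
    if (prefix.size() == length) {
        visit(prefix);  // in a real cracker: hash the candidate and compare
        return;
    }
    for (char ch : alphabet) {
        prefix.push_back(ch);
        enumerate(alphabet, length, prefix, visit);
        prefix.pop_back();
    }
}

// Count the candidates of length 1..max_len by actually enumerating them.
std::uint64_t count_by_enumeration(const std::string& alphabet,
                                   std::size_t max_len) {
    assert(!alphabet.empty());
    std::uint64_t n = 0;
    for (std::size_t len = 1; len <= max_len; ++len) {
        std::string prefix;
        enumerate(alphabet, len, prefix, [&n](const std::string&) { ++n; });
    }
    return n;
}

// Closed form of the same count: |A| + |A|^2 + ... + |A|^max_len.
std::uint64_t count_closed_form(std::uint64_t alphabet_size, unsigned max_len) {
    std::uint64_t total = 0;
    std::uint64_t power = 1;
    for (unsigned i = 0; i < max_len; ++i) {
        power *= alphabet_size;
        total += power;
    }
    return total;
}
```

For a 52-character alphabet, <code>count_closed_form(52, 5)</code> evaluates to 387,659,012, which equals the number of calls to <code>SHA1::ProcessMessageBlock()</code> in the profile. That is consistent with the program hashing every candidate of length 1 through 5 rather than only length 5.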
 
===== Length of 5, upper and lowercase =====
Command: ./brutis
 
 Each sample counts as 0.01 seconds.
      % cumulative     self                 self    total
   time    seconds  seconds       calls  s/call   s/call  name
  83.38     109.96   109.96   387659012    0.00     0.00  SHA1::ProcessMessageBlock()
   8.77     121.52    11.56   387659012    0.00     0.00  SHA1::PadMessage()
   4.01     126.81     5.28  1930693908    0.00     0.00  SHA1::Input(unsigned char const*, unsigned int)
   1.48     128.76     1.96   387659012    0.00     0.00  SHA1::operator<<(char const*)
   1.19     130.33     1.57   387659012    0.00     0.00  SHA1::Result(unsigned int*)
   0.73     131.29     0.96   387659012    0.00     0.00  SHA1::Reset()
   0.36     131.76     0.47           5    0.09    26.35  SHA1::check(char*, int, int, int, char const*)
   0.05     131.83     0.07                               SHA1::~SHA1()
   0.03     131.88     0.05                               SHA1::SHA1()
   0.03     131.92     0.05                               SHA1::operator<<(unsigned char)
   0.02     131.94     0.02                               SHA1::Input(char)
   0.00     131.94     0.00           1    0.00     0.00  _GLOBAL__sub_I__ZN4SHA1C2Ev
   0.00     131.94     0.00           1    0.00     0.00  _GLOBAL__sub_I_main
 
 
===== Summary =====
 
The total runtime for this test was approximately 132 seconds. Applying Amdahl's law, with the 83.38% of the runtime spent in <code>SHA1::ProcessMessageBlock()</code> taken as the parallelisable fraction, the speedup I could expect on my laptop is:
 
S<sub>384</sub> = 1 / ((1 - 0.8338) + (0.8338 / 384)) = 5.9393
The maximum expected speedup on my laptop’s 650M GPU, which has only 384 cores, is 5.9393. This is not a significant increase in speed.
 
My desktop’s GTX 780 has 2304 cores. Using the desktop’s GPU, the resulting speedup would be:
 
S<sub>2304</sub> = 1 / ((1 - 0.8338) + (0.8338 / 2304)) = 6.004
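
The two calculations above can be reproduced with a small helper. This is a minimal sketch: the function names <code>amdahl_speedup</code> and <code>amdahl_limit</code> are hypothetical, and the parallel fraction p = 0.8338 is the share of runtime the profile attributes to <code>SHA1::ProcessMessageBlock()</code>.

```cpp
#include <cassert>

// Amdahl's law: expected speedup when a fraction p of the runtime
// is spread perfectly across n processors and the rest stays serial.
double amdahl_speedup(double p, double n) {
    assert(p >= 0.0 && p <= 1.0 && n >= 1.0);
    return 1.0 / ((1.0 - p) + p / n);
}

// Upper bound on the speedup as n grows without limit.
double amdahl_limit(double p) {
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

Calling <code>amdahl_speedup(0.8338, 384)</code> and <code>amdahl_speedup(0.8338, 2304)</code> gives the laptop and desktop figures, and <code>amdahl_limit(0.8338)</code> gives the ceiling that no core count can exceed.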
 
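After observing these results and analysing the algorithm further, I found that SHA-1 is a largely sequential algorithm that is not entirely suitable for parallelisation. Even with six times as many cores the speedup barely changes, because the 16.62% of the runtime spent outside <code>SHA1::ProcessMessageBlock()</code> limits it to at most 1 / 0.1662 ≈ 6.02.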
 
Because of this, I chose Ben's image processing for parallelisation.
 
=== Assignment 2 ===
=== Assignment 3 ===