Open main menu

CDOT Wiki β

Changes

ParaCode

2,738 bytes added, 17:05, 8 April 2018
Resources
Through the profile, we can see that the most cost of the runtime is 'grep()' and 'convert_char_to_string()' functions. <br>
 
''' Basic Variables '''
First, I look at function 'convert_char_to_string()'. I think it's no necessary to use this function, so I remove it.<br>
[[File:StringremoveJPG.JPG|650px]]
 
'''Serial grep method'''
Simple? But to parallel it is not that simple...<br>
 
'''Parallel grep idea'''
Although we cannot compare a whole string, it is possible to compare by one character. For example, it is possible to find just 'h' in input string. <br>
[[File:FindH.JPG|650px]]
 
With the same idea, we can find 'a':<br>
[[File:FindA.JPG|650px]]
 
What's next??
 
We create an integer array. The size of this array is the same as input's size. This array is to record the characters exist in that position. If characcter exist, we mark it as specific number. Wecall this array as band.<br>
 
First, we initialize all integers as '0'. This means all characters in keyword are not found in that position in input.<br>
[[File:Initial.JPG|650px]]
 
Then, we start to find keyword's characters in input string. For example, we want to find 'h'. As 'h' is the first character in keyword, we will mark it '1' in the integer array 'band'. Like this:<br>
[[File:Mark1.JPG|650px]]
 
With the same idea, we find out the second character 'a'. As 'a' is the second character in keyword, we mark it '2'.<br>
[[File:Mark2.JPG|650px]]
 
After all characters in keyword are found, we will count how many numbers are connected in sequence.<br>
[[File:NumInSequence.JPG|650px]]
 
This is our main idea to parallel the grep function. Yes, it still have some loops. But those loops are not as large as that in serial grep function. So we start to design the code foe it.<br>
 
 
''' Parallel grep code '''
 
Here are part of parallel grep code:<br>
 
Main grep parallel:<br>
[[File:ParaGrep.JPG|650px]]
 
Initial integer arrar kernel:<br>
[[File:InitialBandFunc.JPG|650px]]
 
Grep character kernel:<br>
[[File:GrepKernel.JPG|650px]]
 
Two grep:<br>
[[File:GrepCompare.JPG|650px]]
 
Run and result:<br>
[[File:Result22.JPG|650px]]
 
Yes. The grep of parallel's result is the same as serial's result! It works!<br>
 
''' Profile '''
 
Sample of profile:<br>
[[File:ProfilePara.JPG|650px]]
 
Run and record time data for serial program and parallel program. We got the time for each:<br>
[[File:TimeCompareTwoPart.JPG|650px]]
 
Generate the graphics:<br>
Total time compare:<br>
[[File:TotalCompare.JPG|650px]]
 
Time compare for function grep() and grepPara():<br>
[[File:TimecompareFunction.JPG|650px]]
 
We got it!<br>
=== Assignment 3 ===
 
Next, we optimize the code so that it runs faster.<br>
 
''' Use shared memory '''
 
We change global memory to shared memory in kernel grepx(). The code is:<br>
[[File:SharedMem.JPG|650px]]
 
 
''' Profile '''
 
We get the data like this:<br>
[[File:SharedData.JPG|650px]]
 
 
''' ntpb ''
 
We also try different ntpb and see if it effect runtime.<br>
[[File:NtpbDiff.JPG|650px]]
 
According to the data, the number of ntpb is not an important factor affecting time.
 
''' End '''
 
That's all for our assignment.
 
 
== Resources ==
 
You can see my code here: https://gist.github.com/KignorChan/764c71d9d96561e4101d9830e0bc5e69
 
Idea from LZW compress program: https://codereview.stackexchange.com/questions/86543/simple-lzw-compression-algorithm
167
edits