


649 bytes added, 21:08, 11 April 2017
== WordTranslator - Parallel Approach ==
[ Parallel Solution Branch of GitHub ]
Issues arose when attempting to change data within a kernel on device memory:
1. Kernels do not accept complex objects from the host (maps, vectors, strings).
2. Kernels load and execute on sequential memory, which is reached through device pointers.
3. Replacing the data using a character pointer (char*) proved exceedingly difficult.
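One common way around the first two issues is to flatten the host containers into plain arrays before copying them to the device. The following is a minimal host-side sketch of that idea, not the code from the project branch; the names FlatWords, flatten and word_length are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container: one contiguous character buffer plus word offsets.
struct FlatWords {
    std::vector<char> chars;           // all words back to back, no terminators
    std::vector<std::size_t> offsets;  // offsets[i] = start of word i; last entry = chars.size()
};

// Flatten host-side strings into plain arrays that could be copied to the device.
FlatWords flatten(const std::vector<std::string>& words) {
    FlatWords flat;
    flat.offsets.reserve(words.size() + 1);
    for (const std::string& w : words) {
        flat.offsets.push_back(flat.chars.size());
        flat.chars.insert(flat.chars.end(), w.begin(), w.end());
    }
    flat.offsets.push_back(flat.chars.size());
    return flat;
}

// Length of word i, recovered from neighbouring offsets.
std::size_t word_length(const FlatWords& flat, std::size_t i) {
    assert(i + 1 < flat.offsets.size());
    return flat.offsets[i + 1] - flat.offsets[i];
}
```

Both arrays hold only plain values in contiguous memory, so each could be copied to the device with a single memory copy and indexed there through an ordinary pointer.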
The solution therefore had to be modified to accommodate these issues. A few options were available:
1. Match a pattern found by the kernel.
2. Record that a match was found and the position at which it was found. This would allow another, more sophisticated device function to make the translations. On a CPU this function would be at most O(n^2).
3. Instead, introduce a structure to manage our complex data (see below).
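The second option, recording match positions in a result array, can be sketched on the CPU as follows. This is an illustration under assumptions rather than the kernel from the project: the TextView struct and the function names are hypothetical, and the loop in find_matches stands in for a kernel launch in which each thread handles one index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-data view of a text: pointer plus length, no std::string.
struct TextView {
    const char* data;
    std::size_t length;
};

// Work of one "thread": does the pattern start at position idx of the text?
bool matches_at(TextView text, TextView pattern, std::size_t idx) {
    if (idx + pattern.length > text.length) return false;
    for (std::size_t k = 0; k < pattern.length; ++k) {
        if (text.data[idx + k] != pattern.data[k]) return false;
    }
    return true;
}

// CPU stand-in for the kernel launch: one loop iteration per thread index.
// result[i] is 1 when a match starts at position i, otherwise 0.
std::vector<int> find_matches(TextView text, TextView pattern) {
    assert(pattern.length > 0);
    std::vector<int> result(text.length, 0);
    for (std::size_t idx = 0; idx < text.length; ++idx) {
        result[idx] = matches_at(text, pattern, idx) ? 1 : 0;
    }
    return result;
}
```

Because every index writes only its own slot of the result array, the iterations are independent of each other, which is the property that lets them run as separate threads. A later pass can then read the array to decide where translations should be applied.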
Instead of using a result array, the warp vote function __ballot(PREDICATE) could be considered (more research on this is needed).
MPI - More knowledge of the Message Passing Interface might be needed for a full solution to this problem. See [ MPI Wikipedia]
=== Assignment 3 ===
The optimizations here differed in nature from typical optimizations for kernel launches.
'''Launch Configurations'''
First, the launch configuration was optimized. Initially, threads were launched based on the length of the target text. Later, it proved more useful to launch as many threads as the device could hold in a single block.
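The text does not say how a single block covered a text longer than the block itself. One common pattern, assumed here, is a stride loop in which each thread visits every block_size-th position. The sketch below shows the index arithmetic on the CPU; the function names are hypothetical, and the block size of 1024 used in the note afterwards is an assumed device limit, not a figure from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Positions visited by thread `tid` when `block_size` threads share `n` positions.
// Mirrors a device loop of the form: for (i = tid; i < n; i += block_size).
std::vector<std::size_t> positions_for_thread(std::size_t tid,
                                              std::size_t block_size,
                                              std::size_t n) {
    assert(block_size > 0 && tid < block_size);
    std::vector<std::size_t> visited;
    for (std::size_t i = tid; i < n; i += block_size) {
        visited.push_back(i);
    }
    return visited;
}

// Number of blocks a launch sized by text length would need,
// assuming one thread per character.
std::size_t blocks_needed(std::size_t n, std::size_t block_size) {
    assert(block_size > 0);
    return (n + block_size - 1) / block_size;  // ceiling division
}
```

For a text of 2500 characters and an assumed block size of 1024, a length-based launch would need three blocks, whereas a single block with a stride loop has thread 5 visit positions 5, 1029 and 2053.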
'''Pattern Matching'''
[ Link to Wiki explanation]
'''Runtime'''
[[File:Runtime_CPUvsGPU.png]] [[File:CPUvsGPU_Timing_Matching.PNG]]
'''Streaming Kernel Launch'''
Instead of making changes to increase the efficiency of the kernel, changes were made to incorporate the spirit of the original solution: taking in multiple words from a lexicon and making changes to a large text.
Thus, streamed launches of this kernel can be introduced to split the target data into manageable partitions. In addition, if the patterns are the same size, we can use streaming to look for more than one word concurrently, finding multiple words in less time.
[[File:Streaming_Kernel.PNG]]
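Splitting the target text into partitions raises the question of matches that straddle a partition boundary. The text does not describe how this was handled; the sketch below assumes each partition reads pattern_len - 1 extra characters past its end so that no match is lost. The Chunk struct and partition function are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one partition of the target text.
struct Chunk {
    std::size_t begin;     // first position whose matches this chunk owns
    std::size_t end;       // one past the last owned position
    std::size_t read_end;  // one past the last character the chunk must read
};

// Split text_len positions into chunks of at most chunk_len owned positions.
std::vector<Chunk> partition(std::size_t text_len,
                             std::size_t chunk_len,
                             std::size_t pattern_len) {
    assert(chunk_len > 0 && pattern_len > 0);
    std::vector<Chunk> chunks;
    for (std::size_t begin = 0; begin < text_len; begin += chunk_len) {
        std::size_t end = std::min(begin + chunk_len, text_len);
        // A match starting at end - 1 extends pattern_len - 1 characters further.
        std::size_t read_end = std::min(end + pattern_len - 1, text_len);
        chunks.push_back({begin, end, read_end});
    }
    return chunks;
}
```

Each chunk could then be handed to its own stream, with the copy of characters begin to read_end and the kernel launch for that chunk issued asynchronously. Since every chunk owns a disjoint range of start positions, a match is reported by only one chunk even though neighbouring chunks read overlapping characters.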
