Kernal Blas

261 bytes added, 10:08, 4 April 2018
=== Assignment 3 ===
'''Parallelizing'''
One of the suggested improvements in the linked algorithm post is to change char& c to const char c in the for loop <syntaxhighlight lang="cpp">  for (char& c : input) { </syntaxhighlight> since c is never modified inside the loop. Otherwise, we did not see any other way to parallelize the compression.
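A minimal sketch of the read-only loop form, assuming the loop body only inspects each character. The helper <code>count_runs</code> is hypothetical and is not taken from the assignment code; it only illustrates iterating over <code>input</code> with <code>const char c</code>.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the assignment code): counts runs of
// identical adjacent characters, reading each character without modifying it.
std::size_t count_runs(const std::string& input) {
    std::size_t runs = 0;
    char previous = '\0';
    bool first = true;
    // `const char c` is a read-only copy of each element; an assignment
    // such as `c = 'x';` inside this loop would fail to compile.
    for (const char c : input) {
        if (first || c != previous) {
            ++runs;  // a new run starts whenever the character changes
        }
        previous = c;
        first = false;
    }
    return runs;
}
```

For a single <code>char</code>, copying by value costs no more than binding a reference, so the change mainly documents intent and lets the compiler reject accidental writes. It does not make the loop parallel by itself.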
=== Assignment 2 ===
[[File:Prof.PNG]] <br>
Profiling the code shows that '''cudaMemcpy''' takes up most of the time spent. <br>
Even with only 10 iterations, the time remains at 300 milliseconds. <br>
Once the iteration count passes 25 million, a memory leak appears and makes the results inaccurate. <br><br>