== Team Members ==
# [mailto:tsarkar3@myseneca.ca?subject=dps915 Tanvir Sarkar]
# [mailto:moverall@myseneca.ca?subject=dps915 Michael Overall]
# [mailto:ikrasnyanskiy@myseneca.ca?subject=gpu610 Igor Krasnyanskiy]
# [mailto:tsarkar3@myseneca.ca;moverall@myseneca.ca;ikrasnyanskiy@myseneca.ca?subject=dps915gpu610 Email All]
== Progress ==
<nowiki>****************</nowiki>
A NOTE ON SCALABILITY:
In our attempts to make the kernel scalable with ghost cells, we scaled along one dimension, but we were inconsistent about which one: the 1D kernel scaled along the n (y) dimension, while the 2D kernels scaled along the m (x) dimension. Scaling along the x dimension, while allowing results to be compared between the serial and 2D parallelized versions of the code, produced distributions that were strangely banded and skewed. In other words, we made the code render weird things faster:
[[File:MDimensionScale.png]]
FINAL TIMINGS <pre style="color: red"> THE GRAPH IMMEDIATELY BELOW IS INCORRECT: there was an error recording the 1D runtimes for assignment 2</pre>
Note how the run times for each kernel with shared memory are significantly longer than those with global memory.
To determine whether this was an issue of warp divergence, we also timed a hybrid kernel: it initializes shared memory (using if statements that reference global memory to fill in the ghost cells) but then carries out the actual Jacobi calculations using global memory. The timings are shown below:
[[File:GlobalInitSharedKernelTimes.png]]
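The shared-memory tile loading described above is the part we suspected of causing warp divergence, because only the threads on the edges of each block take the ghost-cell branches. The following CUDA sketch shows that branching pattern; the tile dimensions, names, and ghost-cell logic are illustrative assumptions, not our actual assignment kernel:

```cuda
#define BLOCK_X 32
#define BLOCK_Y 32

// Sketch: shared-memory Jacobi kernel with divergent ghost-cell loads.
__global__ void jacobiShared(const float* in, float* out, int m, int n) {
    __shared__ float tile[BLOCK_Y + 2][BLOCK_X + 2];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int tx = threadIdx.x + 1, ty = threadIdx.y + 1;
    if (i >= m || j >= n) return;

    tile[ty][tx] = in[j * m + i];
    // Divergent branches: only the block's edge threads execute these,
    // so warps straddling a tile edge follow two execution paths.
    if (threadIdx.x == 0 && i > 0)
        tile[ty][0] = in[j * m + i - 1];
    if (threadIdx.x == blockDim.x - 1 && i < m - 1)
        tile[ty][tx + 1] = in[j * m + i + 1];
    if (threadIdx.y == 0 && j > 0)
        tile[0][tx] = in[(j - 1) * m + i];
    if (threadIdx.y == blockDim.y - 1 && j < n - 1)
        tile[ty + 1][tx] = in[(j + 1) * m + i];
    __syncthreads();

    if (i > 0 && i < m - 1 && j > 0 && j < n - 1)
        out[j * m + i] = 0.25f * (tile[ty][tx - 1] + tile[ty][tx + 1] +
                                  tile[ty - 1][tx] + tile[ty + 1][tx]);
}
```

In the hybrid timing run described above, only the loading section would use shared memory, while the final update would read from `in` directly.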
Unfortunately, our group's inability to use the profiling tools effectively has left this discrepancy a mystery.