CDOT Wiki β

Changes

BetaT

98 bytes added, 23:45, 11 April 2017
no edit summary
xxxxx xxxxx
Upon initialization, the first column of the first array has its values set depending on a condition. These values are represented by (o)'s below.
Array 1 Array 2
The next kernel below executes the following calculations.
1st: Array 2 copies the first column of Array 1. This is represented by (o)'s in Array 2.
Array 1 Array 2
oxxxx oxxxx
2nd: Array 1 sets the value at its [0,1] index (marked by a 2) to the value at Array 2's [1,0] index (also marked by a 2).
Array 1 Array 2
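The two steps above can be sketched on the CPU as follows. This is only a minimal illustration, not the kernel from this page: the function name `copyThenShift`, the row-major flattening, and the reading of [0,1] as row 0, column 1 are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU sketch of the two steps described above.
// Assumption: arrays are stored row-major, so [row, col] lives at row * cols + col.
void copyThenShift(std::vector<float>& a1, std::vector<float>& a2,
                   std::size_t rows, std::size_t cols) {
    assert(rows >= 2 && cols >= 2);
    assert(a1.size() == rows * cols && a2.size() == rows * cols);

    // 1st: Array 2 copies the first column of Array 1.
    for (std::size_t r = 0; r < rows; ++r)
        a2[r * cols + 0] = a1[r * cols + 0];

    // 2nd: Array 1's [0,1] element takes the value of Array 2's [1,0] element.
    a1[0 * cols + 1] = a2[1 * cols + 0];
}
```

In a CUDA kernel the loop in the first step would typically be replaced by one thread per row, but that mapping is not shown here.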
With this optimized code it is now possible to execute with a problem size larger than 2000 x 2000.
The Initialize kernel has also been redesigned. Below is the original:
__global__ void Initalize(float* u, float* un, int nx, int nt, float dx)
}
}
 
== FIRST OPTIMIZATION & Execution Comparison Times ==
 
If you have not already, please take a look at section 3.1.1.1 (just above), as it shows how the first iteration of optimization was delivered.
 
Below is a comparison of times from the original CPU version to the newly optimized kernel execution. These comparison times cover the WHOLE execution of the program, not just parts of it: memory transfers, allocation, de-allocation and calculations.
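As a rough illustration of what timing the whole execution means, the sketch below wraps allocation, a stand-in calculation, and de-allocation in a single timed region. The helper name `timeWholeRun` and the placeholder workload are hypothetical and are not taken from the program measured on this page.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: times allocation, calculation and de-allocation together.
// Returns the elapsed wall-clock time in milliseconds.
long long timeWholeRun(std::size_t n) {
    assert(n > 0);
    const auto start = std::chrono::steady_clock::now();
    {
        std::vector<float> u(n * n, 0.0f);            // allocation
        for (std::size_t i = 0; i < u.size(); ++i)    // stand-in for the real calculation
            u[i] = static_cast<float>(i % 7);
    }   // de-allocation happens here, still inside the timed region
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

For a GPU version, the device allocations and host/device copies would presumably sit inside the same timed region so that the numbers stay comparable.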
 
TIMES ARE IN MILLISECONDS
 
N           | Linux  | Visual No Parallel | Parallelized | Optimized_A
(2000 ^ 2)  | 1160   | 20520              | 6749         | 971
(5000 ^ 2)  | 28787  | 127373             | n/a          | 1417
(10000 ^ 2) | 124179 | 522576             | n/a          | 3054
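The table can be read as speedup ratios by dividing a baseline time by the Optimized_A time. The helper below is a small sketch of that arithmetic; the name `speedup` is an assumption for illustration only.

```cpp
#include <cassert>

// Speedup of the optimized run relative to a baseline, both given in milliseconds.
double speedup(double baselineMs, double optimizedMs) {
    assert(optimizedMs > 0.0);
    return baselineMs / optimizedMs;
}
```

Using the values in the table, `speedup(6749, 971)` is roughly 6.95 for the Parallelized column at 2000 ^ 2, and `speedup(124179, 3054)` is roughly 40.7 for the Linux column at 10000 ^ 2.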
 
== SECOND OPTIMIZATION ==