BetaT
Optimizing Problems
The original algorithm was split into two kernels. The first kernel, which causes no problems, is as follows:
'''__global__ void Initalize(double* u, double* un, int nx, int nt, double dx)
{
int i = blockIdx.x * blockDim.x + threadIdx.x;
__syncthreads();
}
}'''
The second kernel works perfectly fine for arguments below 1024 1024 (the user inputs two values). Anything higher, for example 2000 2000, crashes the driver, and the results are left at their pre-kernel-launch values. The kernel code is below: