Changes

Skynet/GPU610

3 bytes added, 16:10, 3 December 2014

→‎Optimizations Used

'''__device__ __forceinline__''' : because the program uses various loops and recursion we force the compiler to use inline functions to speed up the trace and mix functions as well as some methods in the Vec3 and Sphere class.

'''sqrtf, tanf, fmaxf''' : where std:: was being used we replaced it with CUDA's math library equivalents although gains were marginal from this.

'''shared memory''' : we implemented shared memory but quickly realized that it was actually slower then sticking to global memory, we believe this has to do with the number of times the array has to be copied into shared memory.

**We also needed to rework a few parts of code in order to be parallelized

Bruno Pereira

1

edit

Changes

Skynet/GPU610

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

get involved with CDOT

courses

course projects

links

Tools