== History and Current State of CUDA ==
[[File:Arch_NVIDIA.jpg]]
Fermi was one of NVIDIA's early CUDA GPU architectures.
“NVIDIA’s Fermi GPU architecture consists of multiple streaming multiprocessors (SMs), each consisting of 32 cores, each of which can execute one floating-point or integer instruction per clock. The SMs are supported by a second-level cache, host interface, GigaThread scheduler, and multiple DRAM interfaces.”
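The properties mentioned in this description (SM count, L2 cache, memory interfaces) can be read at runtime through the CUDA runtime API. The following is a minimal sketch, not part of the original text: it uses the real runtime calls <code>cudaGetDeviceCount</code> and <code>cudaGetDeviceProperties</code>; the file name <code>query_device.cu</code> is illustrative. On a Fermi-class GPU the reported compute capability is 2.x.

<syntaxhighlight lang="cuda">
// query_device.cu -- sketch: print the device properties that correspond
// to the Fermi description above. Compile with: nvcc query_device.cu -o query_device
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);

        printf("Device %d: %s\n", dev, prop.name);
        printf("  Compute capability            : %d.%d\n", prop.major, prop.minor);
        printf("  Streaming multiprocessors (SMs): %d\n", prop.multiProcessorCount);
        printf("  L2 cache size                  : %d bytes\n", prop.l2CacheSize);
        printf("  Memory bus width               : %d bits\n", prop.memoryBusWidth);
    }
    return 0;
}
</syntaxhighlight>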
== Software Programming Frameworks ==
Software programming frameworks are the structures in which parallel-programming libraries and runtimes reside; developers access their functionality through the frameworks' APIs. Common examples include:
* OpenMP API
* oneAPI Threading Building Blocks (oneTBB)
* CUDA API – an extension of the C and C++ programming languages that exposes thread-level parallelism. It is a high-level abstraction that lets developers exploit the potential of CUDA-enabled devices. In the CUDA API, parallel functions are written as kernels; a kernel is executed in parallel by many CUDA threads. Threads are grouped into blocks, and blocks are organized into a grid (see the example at the end of this section).
[[File:CUDA_code.png]]
The CUDA-specific syntax <<<…>>> in the kernel invocation is the execution configuration: the first value is the number of blocks (1) and the second is the number of threads per block (N).
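As a concrete illustration of kernels, threads, blocks, and the <<<1, N>>> execution configuration described above, here is a minimal, self-contained sketch (not necessarily the same code as in the figure): an element-wise vector addition launched with one block of N threads.

<syntaxhighlight lang="cuda">
// vec_add.cu -- sketch of a CUDA kernel launched with <<<1, N>>>.
// Compile with: nvcc vec_add.cu -o vec_add
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each of the N threads in the single block adds one element.
__global__ void VecAdd(const float* A, const float* B, float* C) {
    int i = threadIdx.x;   // thread index within the block
    C[i] = A[i] + B[i];
}

int main() {
    const int N = 256;
    const size_t bytes = N * sizeof(float);

    // Host data.
    float hA[N], hB[N], hC[N];
    for (int i = 0; i < N; ++i) { hA[i] = (float)i; hB[i] = 2.0f * i; }

    // Device allocations and host-to-device copies.
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    // Execution configuration: 1 block of N threads per block.
    VecAdd<<<1, N>>>(dA, dB, dC);

    // Copy the result back; cudaMemcpy waits for the kernel to finish.
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    printf("C[10] = %f\n", hC[10]);   // expect 30.0

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
</syntaxhighlight>

A single block is limited to the maximum threads per block supported by the device (1024 on Fermi and later), so larger problems launch a grid of many blocks and compute each element's index as blockIdx.x * blockDim.x + threadIdx.x.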