GPU621/MKL

== How MKL Improves Efficiency ==
In this instance MKL uses DGEMM to improve the calculation time. DGEMM stands for '''D'''ouble-precision '''GE'''neral '''M'''atrix-'''M'''atrix multiplication; it computes C := alpha*A*B + beta*C, where alpha and beta are scaling factors. In the example used to demonstrate matrix multiplication, the code defines the multiplication of two matrices along with the scaling factors alpha and beta. Without MKL, the matrix multiplication is done through nested loops; in the MKL-optimized version a single call to cblas_dgemm() replaces them. The ''dgemm'' in the name refers to the DGEMM routine described above, and ''cblas'' refers to the CBLAS interface, the C interface to the '''B'''asic '''L'''inear '''A'''lgebra '''S'''ubprograms ('''BLAS''').
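The sketch below contrasts the two approaches. It is a minimal, self-contained example rather than the exact code from the demonstration: the matrix sizes, initialization values, and row-major layout are assumptions made for illustration, while the cblas_dgemm() call is the standard CBLAS signature that MKL provides through its mkl.h header.

<syntaxhighlight lang="c">
#include <stdio.h>
#include <stdlib.h>
#include <mkl.h>  /* Intel MKL header; provides cblas_dgemm() */

/* Plain nested-loop version: C = alpha*A*B + beta*C (row-major) */
void matmul_naive(int m, int n, int k, double alpha,
                  const double *A, const double *B,
                  double beta, double *C)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int p = 0; p < k; p++)
                sum += A[i * k + p] * B[p * n + j];
            C[i * n + j] = alpha * sum + beta * C[i * n + j];
        }
}

int main(void)
{
    /* Assumed sizes and values, for illustration only */
    int m = 512, n = 512, k = 512;
    double alpha = 1.0, beta = 0.0;

    double *A = malloc((size_t)m * k * sizeof(double));
    double *B = malloc((size_t)k * n * sizeof(double));
    double *C = malloc((size_t)m * n * sizeof(double));

    for (int i = 0; i < m * k; i++) A[i] = (double)(i % 100);
    for (int i = 0; i < k * n; i++) B[i] = (double)(i % 100);

    /* Nested-loop version */
    matmul_naive(m, n, k, alpha, A, B, beta, C);

    /* MKL-optimized version: one call replaces the three loops.
     * lda = k, ldb = n, ldc = n for row-major storage. */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k, alpha, A, k, B, n, beta, C, n);

    printf("C[0] = %f\n", C[0]);
    free(A); free(B); free(C);
    return 0;
}
</syntaxhighlight>

On large matrices the single cblas_dgemm() call is typically far faster than the triple loop, because MKL's kernels are blocked for cache, vectorized, and threaded.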
One part of BLAS, level 3, is dedicated to matrix-matrix operations, which in this case includes the matrix multiplication calculations. While the math and logic behind the implementation of cblas_dgemm() are fairly complicated, a simplified explanation is that it decomposes one or both of the matrices being multiplied into smaller blocks and takes advantage of cache memory to improve computation speed. Decomposing the matrices into block matrices allows general matrix-matrix multiplication to be carried out recursively, one block at a time. Because DGEMM computes C := alpha*A*B + beta*C, the beta parameter lets each partial block product be accumulated directly into the result matrix, eliminating a separate pass over each member of the resulting matrix.
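The following sketch illustrates the blocking idea only; it is not MKL's actual kernel, and the block size BS is an assumed tuning parameter that in practice depends on the cache sizes of the target hardware. It accumulates partial products directly into C, which corresponds to the beta = 1 case of DGEMM.

<syntaxhighlight lang="c">
/* Illustrative cache-blocked multiplication: C += A*B for n-by-n
 * row-major matrices (alpha = 1, beta = 1). A sketch of the
 * blocking idea, not MKL's actual implementation. */

#define BS 64  /* assumed block size; real tuning is hardware-specific */

void matmul_blocked(int n, const double *A, const double *B, double *C)
{
    for (int ii = 0; ii < n; ii += BS)
        for (int pp = 0; pp < n; pp += BS)
            for (int jj = 0; jj < n; jj += BS) {
                int i_end = (ii + BS < n) ? ii + BS : n;
                int p_end = (pp + BS < n) ? pp + BS : n;
                int j_end = (jj + BS < n) ? jj + BS : n;
                /* The BS-by-BS blocks of A and B touched here are
                 * reused many times while resident in cache. */
                for (int i = ii; i < i_end; i++)
                    for (int p = pp; p < p_end; p++) {
                        double a = A[i * n + p];
                        for (int j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[p * n + j];
                    }
            }
}
</syntaxhighlight>

Because each block's partial product is added into C in place rather than stored in a temporary matrix, the block-level results accumulate as the recursion proceeds, which is exactly what the beta parameter of cblas_dgemm() makes possible.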
== Other Mathematical Functionality ==