GPU621/Intel oneMKL - Math Kernel Library

1,710 bytes added, 01:32, 1 December 2021
Finally, set the project's Additional Dependencies to the libraries suggested by the Intel oneMKL Link Line Advisor: https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html<br />
==MKL Testing==
This test compares the running time of a serial matrix multiplication with that of the MKL-optimized version (cblas_dgemm) as the number of threads increases. <br />
 
Serial version:<br />
 
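 /* Time LOOP_COUNT repetitions of the naive triple-loop product C = A * B. */
 /* A is m x p, B is p x n, C is m x n, all stored row-major. */
 clock_t startTime = clock();
 for (r = 0; r < LOOP_COUNT; r++) {
     for (i = 0; i < m; i++) {
         for (j = 0; j < n; j++) {
             sum = 0.0;
             for (k = 0; k < p; k++)
                 sum += A[p * i + k] * B[n * k + j];
             C[n * i + j] = sum;
         }
     }
 }
 clock_t endTime = clock();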
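The serial fragment above relies on variables declared elsewhere. As a minimal self-contained sketch of the same measurement, the following wraps the triple loop in a function and returns the average time per multiplication. The names matmul_serial and time_serial_ms are hypothetical and not part of the original project, and reporting in milliseconds is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Naive serial product C = A * B for row-major matrices:
   A is m x p, B is p x n, C is m x n. (Hypothetical helper.) */
void matmul_serial(const double *A, const double *B, double *C,
                   int m, int n, int p)
{
    assert(A && B && C && m > 0 && n > 0 && p > 0);
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < p; k++)
                sum += A[(size_t)p * i + k] * B[(size_t)n * k + j];
            C[(size_t)n * i + j] = sum;
        }
    }
}

/* Average milliseconds per multiplication over loop_count repetitions.
   clock() counts processor time, which for single-threaded code is
   close to, but not the same as, wall-clock time. (Hypothetical helper.) */
double time_serial_ms(const double *A, const double *B, double *C,
                      int m, int n, int p, int loop_count)
{
    assert(loop_count > 0);
    clock_t start = clock();
    for (int r = 0; r < loop_count; r++)
        matmul_serial(A, B, C, m, n, p);
    clock_t end = clock();
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC / loop_count;
}
```

Calling time_serial_ms(A, B, C, m, n, p, LOOP_COUNT) with the project's matrices would give a figure comparable to the per-call average computed for the MKL version.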
<br />
 
MKL version:<br />
 
 /* Number of threads MKL targets for parallel regions. */
 max_threads = mkl_get_max_threads();
 printf(" Finding max number %d of threads Intel(R) MKL can use for parallel runs \n\n", max_threads);
 
 printf(" Running Intel(R) MKL from 1 to %i threads \n\n", max_threads * 2);
 for (i = 1; i <= max_threads * 2; i++) {
     /* Reset the result matrix before each thread count. */
     for (j = 0; j < (m * n); j++)
         C[j] = 0.0;
 
     mkl_set_num_threads(i);
 
     /* Warm-up call, excluded from the timing. */
     cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                 m, n, p, alpha, A, p, B, n, beta, C, n);
 
     /* Average the elapsed time of LOOP_COUNT calls. */
     s_initial = dsecnd();
     for (r = 0; r < LOOP_COUNT; r++) {
         cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                     m, n, p, alpha, A, p, B, n, beta, C, n);
     }
     s_elapsed = (dsecnd() - s_initial) / LOOP_COUNT;
 }
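To make the cblas_dgemm arguments above easier to read, here is a plain C reference for the row-major, non-transposed case. It is only a sketch and not MKL code: ref_dgemm_rowmajor is a hypothetical helper that performs the update C = alpha * A * B + beta * C, with lda, ldb and ldc standing for the leading dimensions that the call above passes as p, n and n.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference for the row-major, no-transpose case of DGEMM:
   C = alpha * A * B + beta * C, where A is m x p, B is p x n, C is m x n.
   lda, ldb, ldc are the row strides (leading dimensions) of A, B and C.
   This sketch always reads C, even when beta is 0. */
void ref_dgemm_rowmajor(int m, int n, int p, double alpha,
                        const double *A, int lda,
                        const double *B, int ldb,
                        double beta, double *C, int ldc)
{
    assert(lda >= p && ldb >= n && ldc >= n);
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < p; k++)
                sum += A[(size_t)i * lda + k] * B[(size_t)k * ldb + j];
            C[(size_t)i * ldc + j] = alpha * sum + beta * C[(size_t)i * ldc + j];
        }
    }
}
```

With alpha = 1.0 and beta = 0.0 this reduces to the same product as the serial loop, which makes it a convenient check on the MKL result for small matrices.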
 
https://raw.githubusercontent.com/MenglinWu9527/m3u/main/Snipaste_2021-12-01_00-20-37.jpeg
 
{| class="output"
! serial
! 1
! 2
! 3
! 4
! 5
! 6
! 7
! 8
! 9
! 10
! 11
! 12
|-
| 1500
| 15.7
| 7.7
| 6.4
| 8.1
| 7.4
| 7.5
| 8.0
| 7.9
| 7.2
| 7.5
| 7.2
| 8.0
| 8.0
|}
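The table can be summarized as speedup and parallel efficiency. The sketch below assumes that the cells line up with the headers from left to right and that all timings share one unit; neither is stated in the source. The function names speedup and efficiency are hypothetical.

```c
#include <assert.h>

/* Speedup of a run taking t_new relative to a baseline taking t_base,
   both expressed in the same time unit. (Hypothetical helper.) */
double speedup(double t_base, double t_new)
{
    assert(t_base > 0.0 && t_new > 0.0);
    return t_base / t_new;
}

/* Parallel efficiency: speedup over the 1-thread run divided by the
   number of threads used. (Hypothetical helper.) */
double efficiency(double t_one_thread, double t_n_threads, int threads)
{
    assert(threads > 0);
    return speedup(t_one_thread, t_n_threads) / threads;
}
```

Under those assumptions, speedup(1500.0, 15.7) is about 95.5 for single-threaded MKL over the serial loop, speedup(15.7, 7.7) is about 2.04 for two threads over one, and efficiency(15.7, 6.4, 3) is about 0.82 for three threads.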
 
The test machine has 6 logical processors, as reported by WMIC:<br />
wmic:root\cli>cpu get numberoflogicalprocessors<br />
NumberOfLogicalProcessors<br />
6
==References==
references