DPS921/Intel Advisor

1,409 bytes added, 18:57, 7 December 2020
= Roof-line Analysis =
The roofline tool creates a roofline model that represents an application's performance in relation to hardware limitations, including memory bandwidth and computational peaks. To measure performance we use two axes, both in log scale: GFLOP/s (giga floating-point operations per second) on the y-axis and AI (Arithmetic Intensity, in FLOPs/Byte) on the x-axis. With these we can begin to build our roof-line. For any given machine, the CPU can only perform so many FLOPs per second, so we plot that cap on the chart as a horizontal line. Like the CPU, the memory system can only supply so many gigabytes per second; we represent this with a diagonal line (N GB/s * X FLOPs/Byte = Y GFLOP/s). (pic) Together these lines represent the machine's hardware limitations and its best achievable performance at a given AI.

Every function or loop has a specific AI, and when it runs we can record its GFLOP/s. Because the AI of a fixed algorithm does not change, any optimization we make changes only the measured performance. This makes the roofline chart useful for measuring the effect of a given change or optimization.
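The relationship between the two roof lines can be sketched in a few lines of C++. This is a minimal illustration and not part of Intel Advisor: the function names and the sample figures used with them (a 100 GFLOP/s compute peak, 20 GB/s of memory bandwidth) are assumptions chosen only to make the arithmetic easy to follow.

```cpp
#include <algorithm>

// Attainable performance in GFLOP/s under the roofline model.
//   peak_gflops   : compute ceiling of the machine (horizontal line)
//   bandwidth_gbs : memory bandwidth in GB/s (slope of the diagonal line)
//   ai            : arithmetic intensity of the loop in FLOPs/Byte
double attainable_gflops(double peak_gflops, double bandwidth_gbs, double ai) {
    // The diagonal line is bandwidth * AI; the roof is whichever limit is lower.
    return std::min(peak_gflops, bandwidth_gbs * ai);
}

// Arithmetic intensity at which the diagonal line meets the horizontal line.
double ridge_point(double peak_gflops, double bandwidth_gbs) {
    return peak_gflops / bandwidth_gbs;
}

// A loop to the left of the ridge point is limited by memory, not by compute.
bool is_memory_bound(double peak_gflops, double bandwidth_gbs, double ai) {
    return ai < ridge_point(peak_gflops, bandwidth_gbs);
}
```

With the hypothetical machine above, a loop with an AI of 0.5 FLOPs/Byte can reach at most 20 * 0.5 = 10 GFLOP/s, so it sits under the diagonal (memory) roof, while a loop with an AI of 10 FLOPs/Byte is capped by the 100 GFLOP/s compute roof.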
= Memory Access Pattern Analysis =
We can use the Memory Access Patterns (MAP) analysis tool to check for various memory issues, such as non-contiguous memory accesses and non-unit strides. It also reports the types of memory access in selected loops and functions, how you traverse your data, and how that affects your vector efficiency and cache bandwidth usage.
<source>
#include <cstdlib>   // EXIT_SUCCESS
#include <iostream>
using namespace std;

const long int SIZE = 3500000;

typedef struct tricky
{
    int member1;
    float member2;
} tricky;

tricky structArray[SIZE];

int main()
{
    cout << "Starting.\n";
    // Only member1 is written, so consecutive accesses skip over member2.
    for (long int i = 0; i < SIZE; i++)
    {
        structArray[i].member1 = (i / 25) + i - 78;
    }
    cout << "Done.\n";
    return EXIT_SUCCESS;
}
</source>
= How to set up Memory Access Pattern Analysis =
Step 1: Run the roof-line tools.

[[File:Step_4.PNG]]

Step 2: Run the MAP tool.

[[File:Step_5.PNG]]

Step 3: Review the data.

[[File:Step_6.PNG]]

<source>/* Copyright (C) 2010-2017 Intel Corporation. All Rights Reserved.
 *
 * The source code, information and material ("Material")
 * contained herein is owned by Intel Corporation or its
 * suppliers or licensors, and title to such Material remains
 * with Intel Corporation or its suppliers or licensors.
 * The Material contains proprietary information of Intel or
 * its suppliers and licensors. The Material is protected by
 * worldwide copyright laws and treaty provisions.
 * No part of the Material may be used, copied, reproduced,
 * modified, published, uploaded, posted, transmitted, distributed
 * or disclosed in any way without Intel's prior express written
 * permission. No license under any patent, copyright or other
 * intellectual property rights in the Material is granted to or
 * conferred upon you, either expressly, by implication, inducement,
 * estoppel or otherwise. Any license under such intellectual
 * property rights must be express and approved by Intel in writing.
 * Third Party trademarks are the property of their respective owners.
 * Unless otherwise agreed by Intel in writing, you may not remove
 * or alter this notice or any other notice embedded in Materials
 * by Intel or Intel's suppliers or licensors in any way.
 * This file is intended for use with the "Memory Access 101" tutorial.
 */
#include <iostream>
#include <time.h>
return EXIT_SUCCESS;
}</source>
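To make the stride issue from the Memory Access Pattern Analysis section concrete, the sketch below contrasts the array-of-structures layout used by the `tricky` sample with a structure-of-arrays layout. It is an illustration under stated assumptions, not code from the Intel tutorial: `Tricky`, `TrickySoA`, `fill_aos`, `fill_soa` and `aos_stride_bytes` are hypothetical names introduced here.

```cpp
#include <cstddef>
#include <vector>

// Array of structures (AoS): same members as the tricky struct in the sample.
struct Tricky {
    int member1;
    float member2;
};

// Structure of arrays (SoA): each member is stored in its own contiguous array.
struct TrickySoA {
    std::vector<int> member1;
    std::vector<float> member2;
    explicit TrickySoA(std::size_t n) : member1(n), member2(n) {}
};

// AoS fill: consecutive writes to member1 are sizeof(Tricky) bytes apart,
// so the loop steps over the unused member2 on every iteration.
void fill_aos(std::vector<Tricky>& data) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i].member1 = static_cast<int>(i / 25 + i) - 78;
    }
}

// SoA fill: consecutive writes to member1 are sizeof(int) bytes apart,
// which is a unit stride over the int array.
void fill_soa(TrickySoA& data) {
    for (std::size_t i = 0; i < data.member1.size(); ++i) {
        data.member1[i] = static_cast<int>(i / 25 + i) - 78;
    }
}

// Distance in bytes between member1 of two adjacent AoS elements.
std::size_t aos_stride_bytes() {
    return sizeof(Tricky);
}
```

Both functions store the same values; only the distance between consecutive writes differs. On common platforms `sizeof(Tricky)` is 8 bytes against 4 bytes for a plain `int`, so the AoS loop would be expected to show up in a MAP report as a constant, non-unit stride, while the SoA loop would be expected to show a unit stride.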
= Sources =