Difference between revisions of "GPU621/Code"

From CDOT Wiki

Revision as of 08:59, 26 November 2018

Debugging Threads in Intel Parallel Studio

Group Members

  1. Corey James
  2. Guozhao Liang
  3. Oleksii Kozachenko

Intel® Parallel Studio XE

Intel Parallel Studio XE is a software development tool suite for compiling applications and optimizing performance with less effort.

The Intel C++ Compiler is not the only tool that comes with IPS XE 2019. It also includes the following applications:

File:XEVTuneLogo.jpg

  • Intel® Advisor
  • Intel® Inspector
  • Intel® VTune™ Amplifier

Let's take a quick look at each of them:

Intel® Advisor

Vectorization optimization and thread prototyping.

Use this tool in the vectorization and threading stages of the flow.

Intel® Inspector

File:XEInspectorLogo.jpg

Memory and thread debugger.

Use this tool to find races, deadlocks, and illegal memory accesses.

  • Locate root cause errors early―before you release
  • Quickly debug intermittent races and deadlocks

Intel® VTune™ Amplifier

Performance profiler.

Use this tool in the threading and bandwidth optimization stages and for advanced vectorization optimization.

  • Save money: Locate root cause errors early―before you release
  • Save time: Quickly debug intermittent races and deadlocks


Intel Inspector

Intel Inspector is a dynamic memory and threading error checking instrument to inspect serial and multi-threaded programs.

Intel Inspector comes with Intel Parallel Studio XE along with two other debugging tools - VTune and Advisor.

Create a project

There are two ways to work with Inspector.

  • Run Inspector directly from Visual Studio

This is the easiest and fastest way, and it requires no additional configuration.

Inspector VS.PNG


  • Run as a separate program.

Working with the standalone Intel Inspector application requires passing it a compiled version of your program. Additionally, you may need to link some libraries (lib, dll, etc.).

Inspector app.PNG


Configure a project

Intel suggests using small data set sizes and loading threads with small chunks of work.

This reduces the run time and speeds up the analysis.


Choose analysis type

Inspector allows you to choose between predefined types of analysis.

Inspector 01 init window.PNG

  • Memory error analysis
    • Detect leaks
    • Detect Memory problems
    • Locate memory problems


  • Threading error analysis
    • Detect deadlocks
    • Detect deadlocks and data races
    • Locate deadlocks and data races


  • Custom analysis types - users can create their own types based on a selected preset type.
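As an illustration of the kind of defect a threading error analysis is meant to find, here is a minimal C++ sketch of a data race and one possible fix. The function names `racy_count` and `safe_count` are hypothetical and are not taken from any Intel sample; the sketch only assumes a standard C++11 compiler.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example: every thread increments a shared counter without
// any synchronization. This is a data race (undefined behavior), and the
// final value is often smaller than num_threads * increments_per_thread.
long racy_count(int num_threads, int increments_per_thread) {
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                ++counter;  // unsynchronized read-modify-write
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}

// Same work, but each increment is protected by a mutex, so only one
// thread at a time can modify the counter.
long safe_count(int num_threads, int increments_per_thread) {
    long counter = 0;
    std::mutex counter_mutex;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, &counter_mutex, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Calling `racy_count(4, 1000)` from `main` gives Inspector an unsynchronized write to examine, while `safe_count(4, 1000)` performs the same work without the race. Small arguments like these also match the advice to keep data sets small during analysis.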


How it works

Inspector performs the analysis in multiple steps:

1. Executes the program.

2. Identifies problems that may need to be resolved.

3. Gathers the problems.

4. Converts symbol information into filenames and line numbers.

5. Applies suppression rules.

6. Removes duplicates.

7. Creates problem sets.

8. Opens a debugging session.


Interpreting results

Memory leak problem: https://www.codeproject.com/Tips/1184749/Allocating-Memory-in-C-Cplusplus-How-to-Avoid-Memo
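For a concrete picture of a memory leak, the sketch below shows a hypothetical function that leaks a heap buffer, followed by a version that lets `std::vector` own the memory. The names `sum_leaky` and `sum_owned` are made up for this illustration and do not come from the linked article.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical leaky helper: the buffer allocated with new[] is never
// released, so every call loses n * sizeof(int) bytes.
long sum_leaky(std::size_t n) {
    int* values = new int[n];
    for (std::size_t i = 0; i < n; ++i) values[i] = static_cast<int>(i);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += values[i];
    return total;  // missing delete[] values; the allocation leaks
}

// Same computation, but std::vector owns the buffer and frees it
// automatically when the function returns.
long sum_owned(std::size_t n) {
    std::vector<int> values(n);
    std::iota(values.begin(), values.end(), 0);  // fill with 0, 1, ..., n-1
    return std::accumulate(values.begin(), values.end(), 0L);
}
```

Both functions return the same sum; only `sum_leaky` leaves an allocation behind for a memory error analysis to report.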

Intel VTune™ Amplifier

Introduction

Intel VTune Amplifier is analysis software that lets you measure the performance of your serial or multithreaded program. VTune allows you to analyze the performance of your algorithms and multithreading. It can help with debugging threads by calculating overhead and finding bottlenecks or inefficiencies.

Create a project

I will be explaining how to use VTune, part of Intel® Parallel Studio XE 2019, alongside Visual Studio 2017. Intel® Parallel Studio XE can be found here: https://software.intel.com/en-us/parallel-studio-xe. Once installed, you will be able to find VTune in the Tools tab inside Visual Studio.

Tools tab.PNG

Configure a project

When you hover over Intel VTune Amplifier 2019 in the Tools menu, you will see more options appear. Select Configure Analysis.

Options.PNG

The following screen will be displayed. Click on the three little dots circled in the picture below to expand more options for running VTune.

SetupOptions.PNG

This menu will appear. It contains different tests that you can run against your program. The one I am going to look through is the default test, Hotspots. Depending on your program, you may also want to look into the other options.

Startup options.PNG

How it works

On the main page, we can run the test by clicking on the blue play button.

Start test.PNG

When the test completes, the summary page will be displayed. This will outline the results from the test.

Test complete.PNG

Interpreting results

The following picture shows the different tabs available after a Hotspots analysis.

Tabs.PNG

  • Analysis configuration
    • Main configuration page for VTune
  • Collection log
    • Logs from the analysis
  • Summary
    • Elapsed Time: the amount of time your program took to run.
      • CPU Time: displays the effective, spin and overhead times.
    • Top Hotspots: displays the areas that were most active in your program.
    • Effective CPU Utilization Histogram: shows how long your program spent using a given number of threads. The x-axis is the number of threads in use at the same time, and the y-axis is the amount of time your program ran with that number of threads.
    • Collection and Platform Info: displays the hardware information about the computer the test was run on.
  • Bottom-up
    • Allows you to see the call stack of a function starting from the first call.
  • Caller/Callee
    • Allows you to see details on each function and see the callers and callees for each function.
  • Top-down tree
    • Shows the call stack of the program as a tree starting from the top.
  • Platform
    • Displays the time and the utilization of each thread.
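To have something to look at in the Top Hotspots list, it helps to profile a program where one function clearly dominates the run time. The following is a minimal sketch under that assumption; `slow_fib` and `fast_fib` are hypothetical names chosen for this example and are not part of any VTune sample.

```cpp
#include <cassert>
#include <cstdint>

// Deliberately expensive: naive recursive Fibonacci, which takes
// exponential time in n. In a profile, most CPU time should land here.
std::uint64_t slow_fib(unsigned n) {
    return n < 2 ? n : slow_fib(n - 1) + slow_fib(n - 2);
}

// Cheap alternative: iterative Fibonacci, linear time in n.
std::uint64_t fast_fib(unsigned n) {
    std::uint64_t a = 0, b = 1;
    for (unsigned i = 0; i < n; ++i) {
        std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

A `main` that calls both functions with the same moderately large argument should, under a Hotspots analysis, attribute nearly all of the CPU time to `slow_fib`. The Bottom-up and Top-down tree tabs can then be used to see where the calls come from.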