GPU621/Intel Parallel Studio Inspector
Latest revision as of 09:15, 9 December 2021
Group Members
Project Overview
This project was created to provide detailed documentation on Intel Parallel Studio Inspector, how it works, and demonstrate key features that will help reduce write parallel
Intel Parallel Studio Inspector
Intel Parallel Studio Inspector is a dynamic analysis tool that helps users detect memory and threading errors in their serial and multithreaded applications. Intel Inspector is available on Windows and Linux operating systems and works with the C, C++, C#, and Fortran programming languages. For this project, we will be using Visual Studio 2019 with Intel Inspector.
The Threading Debugger provides support for the following parallelization models:
- OpenMP
- Threading Building Blocks (TBB)
- Parallel language extensions for the Intel C++ Compiler
- Microsoft PPL
- Win32 and POSIX threads
- Intel MPI
Supported development environments:
- Microsoft Visual Studio
- Eclipse
- Stand-alone applications
- Command line
Supported compilers:
- Intel® C++ and Intel® Fortran Compilers
- Microsoft Visual C++ compiler
- GNU Compiler Collection (GCC)
Visit this site for more information on how to install Intel Inspector: https://www.intel.com/content/www/us/en/developer/tools/oneapi/inspector.html#gs.i68l9p
How to use Intel Inspector
Finding Memory Issues
Memory management errors are common in programs, and they can be especially difficult to track down. Intel Inspector can locate different memory issues such as memory leaks, memory corruption, allocation / de-allocation API mismatches, inconsistent memory API usage, illegal memory access, and uninitialized memory reads.
Once Intel Inspector is installed and you have a program you wish to analyze, navigate to:
Tools > Intel Inspector > New Analysis
Levels of Memory Analysis: Speed vs Thoroughness
An Intel Inspector window opens with a number of settings. At the top, Intel Inspector lists three analysis levels to choose from.
Detect Leaks
The least thorough and least resource-intensive analysis. This setting reduces the load on the system and cuts the resources and time needed to perform the analysis. As a result, the analysis runs much faster but finds a limited set of errors and provides fewer details.
Detect Memory Problems
This setting performs a medium-scope memory error analysis. Compared with Detect Leaks, it takes more time and resources and puts more load on the system. It is a deeper level of analysis for finding memory issues, at the cost of a slower run.
Locate Memory Problems
This setting maximizes the scope of the memory analysis. It also maximizes the time, resources, and load on the system needed to perform the analysis. It detects an extensive range of memory issues and displays the context of each problem with the highest degree of detail available.
Memory Analysis Options
Below are explanations of some important memory analysis options that a user may consider selecting:
Detect resource leaks – detects whether a GDI object is not deleted or a kernel object is not closed. Useful for Windows GUI applications.
Enable interactive memory growth – detects whether a region of memory was allocated but not deallocated during a chosen interval of the program’s execution.
Enable on-demand memory leak detection – detects whether a region of memory was allocated but not deallocated during a chosen interval of the program’s execution and is no longer reachable (no pointer to the memory location still exists).
Remove Duplicates – when this setting is on, Intel Inspector does not display every occurrence of a detected problem in the Code Location.
Stack frame depth – sets how many stack frames of context are collected for each problem. A deeper stack is useful for intricate object-oriented applications.
Example: Finding a Memory Leak
The code below contains an obvious memory leak. The memory allocated for the integers is never deallocated:
Step 1: Navigate to Tools > Intel Inspector > New Analysis
Step 2: Select the level of analysis you want Intel Inspector to perform and the options you want to be included.
Step 3: To start a memory analysis, the field on the left must be set to “Memory Error Analysis”. After the desired options are checked, click the “Start” button on the right-hand side to begin the analysis.
Step 4: Once the analysis is complete, Intel Inspector will show the memory leak and where it occurs within the code. Using these steps, a user can find many kinds of memory issues within their code and fix them.
Finding Nondeterministic Threading Errors
Intel Inspector can help users detect a variety of threading errors such as data races, deadlocks, lock hierarchy violations, and cross-thread stack access errors. These threading errors are usually non-deterministic and can be hard to reproduce.
Levels of Threading Error Analysis: Speed vs Thoroughness
An Intel Inspector window opens with a number of settings. At the top, Intel Inspector lists three threading analysis levels to choose from.
Detect Deadlocks
The least thorough and least resource-intensive threading analysis. This setting reduces the load on the system and cuts the resources and time needed to perform the analysis. As a result, the analysis runs much faster but finds a limited set of errors and provides fewer details. This setting does not look for data races.
Detect Deadlocks and Data Races
This setting performs a medium-scope threading error analysis. Compared with Detect Deadlocks, it takes more time and resources and puts more load on the system. It is a deeper level of analysis for finding threading issues, at the cost of a slower run. This setting includes information on data races.
Locate Deadlocks and Data Races
This maximizes the scope of the threading analysis. Using this setting will also maximize the time, resources, and load on the system to perform the analysis. This will detect an extensive range of threading issues including deadlocks and data races. Intel Inspector will display the context of the problem and the highest degree of information available.
Deadlocks
Deadlocks can occur in multithreaded programs when two or more threads are stuck waiting for each other: each thread holds a resource that another thread needs, so none of them can proceed. A program with a potential deadlock may run fine on the first try, but the deadlock will eventually occur and hang the program. The following program demonstrates a deadlock. The code involves resources protected by mutex locks, which are acquired in the order m1->m2 or m2->m1. In some cases, two threads deadlock because each is waiting for a mutex owned by the other.
Deadlock occurs and the program doesn’t end
Using Intel Inspector, we can detect the deadlock in the program.
Points to mutex.lock function
Indicating where it occurred in the code
Intel Inspector lets us quickly locate where the error occurs, so we can move on to debugging the program and finding a suitable fix.
Race Conditions
Intel Inspector can be used to detect race conditions in programs. The following code is an example of a race condition. In it, 5 threads are created to increment the value of a shared object several times. A race condition occurs when the threads access the same data without synchronization, so the final value is inconsistent and differs between runs. Data races are normally hard to locate manually, but with the help of Intel Inspector, finding them is much faster.
Values below 20000 are caused by race conditions
Using Intel Inspector, we can see where the race condition occurs in the code.
Indicating where the data race occurs