GPU621/Intel Inspector
Intel Parallel Studio Inspector
Project Overview
Intel Inspector is a dynamic memory and threading error debugger that can detect and locate memory leaks, deadlocks, and race conditions. The purpose of this project is to introduce Intel Inspector and demonstrate how to use it to debug our code.
Group Members
Features
The purpose of Intel Inspector is to help us find difficult and non-deterministic errors in large programs. As a program grows and its logic becomes more complicated, memory leaks and threading errors become harder to find. Some of its main features are:
- Locate Nondeterministic Threading Errors
Threading errors are usually nondeterministic and difficult to reproduce. Intel Inspector helps detect and locate them, including data race conditions (heap and stack races), deadlocks, lock hierarchy violations, and cross-thread stack access errors.
- Detect Hard-to-Find Memory Errors
Memory errors can be difficult to find, such as memory leaks, corruption, mismatched allocation and deallocation API, inconsistent use of memory API, illegal memory access, and uninitialized memory read. Intel Inspector finds these errors and integrates with a debugger to identify the associated issues. It also diagnoses memory growth and locates the call stack causing it.
- Simplify the Diagnosis of Difficult Errors
Debugger breakpoints diagnose errors by breaking into the debugger just before the error occurs. When debugging outside of Intel Inspector, a breakpoint stops execution at the right location. The problem with this is that the location might be executed thousands of times before the error occurs. By combining debug with analysis, Intel Inspector determines when a problem occurs and breaks into the debugger at the right time and location.
- Find Persistent Memory Errors
Intel® Optane™ DC persistent memory is a new memory technology with high-capacity persistent memory for the data center. It maintains data even when the power is shut off, but this data must first be properly flushed out of volatile cache memory. Persistence Inspector helps find possible persistent memory errors so that the system operates correctly when the power is restored.
It detects:
- Missing or redundant cache flushes
- Missing store fences
- Out-of-order persistent memory stores
- Incorrect undo logging for the
Software supported
Intel Inspector supports various languages (C, C++, and Fortran), operating systems (Windows and Linux), IDEs (Visual Studio, Eclipse, etc.), and compilers (Intel C++, Intel Fortran, Visual C++, GCC, etc.). It also supports OpenMP, TBB, parallel language extensions for the Intel C++ Compiler, Microsoft PPL, Win32 and POSIX threads, and the Intel MPI Library.
Tutorial
We will analyze a simple program containing a memory error to demonstrate the steps for running an Intel Inspector analysis.
Step 1: Write the code to be analyzed and build the project.
Step 2: Go to Tools > Intel Inspector > Memory Error Analysis (several other analysis types are available, depending on your requirements).
Analysis Panel Details
On-demand Memory Analysis
Intel Inspector customarily displays memory leaks at the end of an analysis run when an application exits; however, you can also use the Intel Inspector on-demand memory leak detection feature to gather memory leak information while an application is running. This is useful if:
- An application does not terminate (such as a server process).
- You want memory leak information, but you do not want to wait for an application to terminate.
- You want to determine if memory is leaked during a specific interval of application execution, or during a specific user action.
Invalid Memory Access
Here we access c[1] after c has already been deleted. Analyzing this code reports an invalid memory access error along with the line number where the invalid access occurs. Screenshots of this analysis follow the code.
#include <iostream>
int main()
{
    char* c;
    c = new char[100]; // requests 100 bytes of heap memory
    for (int i = 0; i < 100; i++) {
        c[i] = 'a';
    }
    std::cout << c[10] << std::endl;
    delete[] c; // the memory is released here
    c[1] = 'a'; // invalid access: c has already been deleted
}
Walkthrough
Memory Leak
This C++ program allocates an array of five integers and then terminates without freeing it.
int main()
{
    int* myInts = new int[5] ;
    return 0 ;
}
Inspector reports a memory leak at line 4 of the analyzed source file: the array allocated with new[] is never released.
The Inspector User Guide has provided the following solutions.
Mismatched Deallocation and Missing Allocation
This example deallocates the same memory twice: first with delete, which does not match the new[] allocation, and then with delete[].
int main()
{
    int* myInts = new int[5] ;
    delete myInts ; // deallocates a single object (mismatched with new[])
    delete[] myInts ; // deallocates an array of objects (already freed above)
    return 0 ;
}
When the deallocation type does not match the allocation, Inspector flags both the allocation line and the mismatched deallocation line.
Inspector also reports a missing allocation at line 7 of the analyzed source file, because delete[] is applied to memory that the previous line has already deallocated.
The Inspector User Guide has provided the following solutions.
Invalid Memory Access
This example assigns a value to an array element after the array has been deallocated.
int main()
{
    int* myInts = new int[5] ;
    delete[] myInts ;
    myInts[0] = 2 ;
    return 0 ;
}
Inspector reports a problem of type invalid memory access. It lists the line where a value is assigned to the deallocated array, along with the lines where the array was allocated and deallocated.
Here is a diagram that demonstrates the process of an invalid memory access.
The Inspector User Guide has provided the following solutions.
Race Condition
Code with Race Condition
This C program uses the OpenMP library to calculate the value of pi by computing the area under a curve. All threads update the shared variable sum without any synchronization.
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
int main()
{
    long long int i, n = 10000000;
    double x, pi;
    double sum = 0.0;
    double step = 1.0 / (double)n;
    #pragma omp parallel for private(i,x)
    for (i = 0; i < n; i++)
    {
        x = (i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("pi = %17.15f\n", pi);
    return 0;
}
There is a race condition on line 15, at the update of sum. The timeline at the bottom right of the screenshot shows that threads #1 and #2 are competing to write to the sum variable.
Here is a diagram that demonstrates the race condition in which two threads try to write data at the same time.
At this point, thread #0 is trying to read data while thread #6 is trying to write data.
This shows that Inspector is able to capture what each thread is doing.
Here is a diagram that demonstrates the race condition in which one thread tries to read data while another tries to write it.
The Inspector User Guide has provided the following solutions.
Fixed Code
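This is the solution to the code above. The #pragma omp atomic directive makes each update of sum atomic, so threads can no longer interleave their read-modify-write steps.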
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
int main()
{
    long long int i, n = 10000000;
    double x, pi;
    double sum = 0.0;
    double step = 1.0 / (double)n;
    #pragma omp parallel for private(i,x)
    for (i = 0; i < n; i++)
    {
        x = (i + 0.5) * step;
        #pragma omp atomic
        sum += 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("pi = %17.15f\n", pi);
    return 0;
}