GPU621/Pragmatic

6,293 bytes added, 22:50, 20 November 2016
====Walkthrough====
In this part we will show you how to use the '''Parallel Stacks''' window and how it can help you find bugs and trace the flow of a program that runs multiple threads.
 
'''First''', copy the source code of the simple program below and paste it into the '''mainPartTwo.cpp''' file. We refer to line numbers later on, so the "''// FIRST LINE''" comment must '''actually be on the first line''' of the '''mainPartTwo.cpp''' file.
 
[[File:Project_startup_parallel_stacks.JPG|200px|thumb|right|alt| Main project Window]]
[[File:Parallel_stacks_1.JPG|200px|thumb|right|alt| Program Flow]]
 
A few words about our program. We kept it simple: it consists of 11 functions plus the main function. In the main function we create a parallel region, send the Master thread (with threadId == 0) to one place, and split the rest of the threads between functions A() and B(). All the even threadIds go to A(), and all the odd threadIds go to B(). A() and B() in turn each create one more fork and split the threads further into a few functions, which makes a nice-looking diagram in the Parallel Stacks window.
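
The routing rule described above can be sketched as follows. This is a minimal sketch, not the code from '''mainPartTwo.cpp''': the names routeFor and showRouting are hypothetical, and the _OPENMP guards are only there so the sketch still compiles (and runs with a single thread) when OpenMP is switched off.

```cpp
#include <cassert>
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not part of mainPartTwo.cpp): decides which branch a thread takes.
// 'M' = Master thread's own branch, 'A' = function A(), 'B' = function B().
char routeFor(int threadId) {
    if (threadId == 0) return 'M';           // Master thread goes its own way
    return (threadId % 2 == 0) ? 'A' : 'B';  // even ids -> A(), odd ids -> B()
}

// Opens a parallel region and prints the branch chosen for each thread.
void showRouting() {
    #pragma omp parallel
    {
        int threadId = 0;                    // stays 0 when OpenMP is disabled
#ifdef _OPENMP
        threadId = omp_get_thread_num();
#endif
        std::printf("thread %d -> %c\n", threadId, routeFor(threadId));
    }
}
```

Calling showRouting() from main prints one line per thread. The order of the lines can differ from run to run, because the threads are not synchronized with each other.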
 
Our program contains 2 bugs. Let's find them together!
 
1. Set breakpoints on lines 59, 106, 116, 122, 132, 138, 148, 154, 164, 169, 179, 185, 195, 201, 211, 219, 227, 231, 240.
 
2. Choose '''Debug''' -> '''Start Debugging''' to start the program in debug mode. It will launch 2 terminals. You can close the partOne terminal at this point; you don't need it anymore.
 
3. Next, open the Parallel Stacks window. Once the application is running in debug mode, go to '''Debug''' -> '''Windows''' -> '''Parallel Stacks''' and give the window some space (see the Main project Window screenshot on the right), otherwise the diagram will be too small.
 
4. At this point you should see something similar to our Main project Window screenshot (on the right). In the '''Parallel Stacks''' window the root box shows 5 threads. In fact our program has just 4 threads. The fifth thread is related to debugging; it stays in the __kmp_launch_monitor task until the end of the program.
As you can see, 1 thread went to the left: this is our Master thread with id == 0. The rest, which is just 2 threads in our case (the third hasn't been created yet), went to the right. They then split: one went to function A() and the other went to function B(). A, B, F, etc. are function names; we kept it simple. You can also see on the screenshot (and hopefully in your Visual Studio as well) that the top right box, in function A, says "External Code". Our function A invoked printf() to print a Hello message to the console, and this is how that call is displayed in the '''Parallel Stacks''' window.
 
5. Hit the '''Continue''' button (or '''F5''') 3 times. Your Parallel Stacks window should now look similar to the Program Flow screenshot on the right. This is how you can trace what each thread is currently doing. In our case the Master thread went to the left and is currently in function K(). The 3 other threads went to the right: one of them went to A() and further to C(), and the other 2 went into B(), which split them and sent them to functions G() and F().
 
6. Click the '''Continue''' button '''6 more times''', observe the changes, and stop here for a moment (give it some time between clicks, since there are long for loops that can take 1-3 seconds to execute). While going through the breakpoints you probably saw our Master thread go from function L() to function K(), functions C() and G() execute '''#pragma omp barrier''' statements and enter a waiting state, and function F() finish its work and return to main.
 
7. Keep clicking '''Continue''', say 4 more times, and observe the changes in the Parallel Stacks window. Do you see a bug?
{| class="wikitable mw-collapsible mw-collapsed"
! See the answer
|-
| Our Master thread keeps bouncing between functions L() and K(). This is an infinite loop, and you just traced it using the Parallel Stacks window!
|}
 
8. Let's fix this bug. Go to line 51 and uncomment the call to J(), then comment out the call to K() on line 52.
 
9. One bug left. Stop the program and run it in debug mode again so that the infinite loop is gone. Don't forget to close the partOne terminal.
 
10. Once it is running, hit the '''Continue''' button '''7 times''' and observe the changes. You should see that our Master thread went to J() instead of entering the infinite loop between K() and L(). You can also see that the blocks for functions G(), C(), and J() (where our Master thread went) have executed barriers.
 
11. Hit '''Continue''' once again. At this point the application gets stuck: you can't click '''Continue''' anymore and nothing is happening in the console. This is our second bug, and it is another example of deadlock, the worst kind of bug in parallel programming because it is so difficult to trace.
 
12. In this particular case the '''Parallel Stacks''' window gives us a hint. Before the application hung, you saw that functions J(), G() and C() invoked barriers, but F() didn't. Functions J(), G() and C() are waiting for F() to join them, but F() has no '''#pragma omp barrier''' statement, so F() simply finishes its work without ever arriving at the barrier, and the others keep waiting.
 
13. Add a '''#pragma omp barrier''' statement at line 165 and go through the application again. Everything should work now.
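
The rule behind the second bug is that a barrier only releases once every thread in the team has reached it. The sketch below is a hypothetical illustration of that rule, not the code from '''mainPartTwo.cpp''' and not the exact fix applied in step 13: it places a single barrier after the branch-specific work, so no branch can skip it. The name runWithSharedBarrier is an assumption made for this example.

```cpp
#include <cassert>
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical sketch: each thread does branch-specific work, then ALL threads
// meet at one barrier placed outside the branches.
// Returns how many threads had finished their work once the barrier released.
int runWithSharedBarrier() {
    int arrived = 0;           // threads that finished their branch
    int seenAfterBarrier = 0;  // value of 'arrived' observed after the barrier
    #pragma omp parallel shared(arrived, seenAfterBarrier)
    {
        int threadId = 0;      // stays 0 when OpenMP is disabled
#ifdef _OPENMP
        threadId = omp_get_thread_num();
#endif
        if (threadId % 2 == 0) {
            std::printf("thread %d: even-branch work\n", threadId);
        } else {
            std::printf("thread %d: odd-branch work\n", threadId);
        }

        #pragma omp atomic
        arrived++;

        #pragma omp barrier    // every thread in the team reaches this line

        #pragma omp single
        seenAfterBarrier = arrived;  // all increments are complete at this point
    }
    return seenAfterBarrier;
}
```

With OpenMP enabled the function returns the team size, because nobody passes the barrier until every thread has incremented the counter. If one branch returned before the barrier, the remaining threads would wait forever, which is the hang seen in step 11.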
 
You fixed both bugs in Part 2, congratulations!
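
Looking back at the first bug, the ping-pong pattern behind it can be sketched like this. The functions hopK and hopL are hypothetical stand-ins for K() and L(), and we only assume that the two hand control to each other; the real functions in '''mainPartTwo.cpp''' may loop differently. The hopsLeft budget is an addition made for this sketch so that it always terminates.

```cpp
#include <cassert>

// Hypothetical stand-ins for K() and L(). 'hopsLeft' is an assumed safety budget;
// without it the two functions would hand control back and forth forever.
int hopL(int hopsLeft);

// Returns the number of hops performed starting from K.
int hopK(int hopsLeft) {
    if (hopsLeft == 0) return 0;      // budget exhausted: stop bouncing
    return 1 + hopL(hopsLeft - 1);    // K hands control to L
}

// Returns the number of hops performed starting from L.
int hopL(int hopsLeft) {
    if (hopsLeft == 0) return 0;      // budget exhausted: stop bouncing
    return 1 + hopK(hopsLeft - 1);    // L hands control straight back to K
}
```

In the Parallel Stacks window a pattern like this shows up as one thread whose top frame keeps alternating between the same two functions while the other threads make progress or wait.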
 
Last but not least, you can try running this application with 8 threads if your CPU allows it. You will see a bigger picture of the program in the Parallel Stacks window.
To do this:

* Check how many CPU cores you have. In Windows 7 you can do this by going to '''Task Manager''' (Ctrl + Shift + Esc) -> '''Performance''' and counting the boxes under '''CPU Usage History'''. If you have 4, you don't need to change anything.
* If you have 8, you might want to change our default number of threads from 4 to 8 on line 30, here:
 
omp_set_num_threads(4);
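
As an alternative to counting the boxes by hand, the thread count could be chosen at run time. The sketch below is only an illustration under assumptions: pickThreadCount and configureThreads are hypothetical names, the 4-or-8 rule simply mirrors the two cases above, and omp_get_num_procs() reports the processors OpenMP can see, which may differ from the number of physical cores.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: 8 threads when at least 8 processors are reported, otherwise 4.
int pickThreadCount(int processors) {
    return processors >= 8 ? 8 : 4;
}

// Queries the processor count (when OpenMP is enabled), applies the choice,
// and returns the number of threads that was requested.
int configureThreads() {
    int processors = 4;                  // assumed fallback when OpenMP is disabled
#ifdef _OPENMP
    processors = omp_get_num_procs();    // processors available to the OpenMP runtime
#endif
    int count = pickThreadCount(processors);
#ifdef _OPENMP
    omp_set_num_threads(count);          // same call as in the walkthrough program
#endif
    return count;
}
```

Calling configureThreads() before the first parallel region would replace the hard-coded value. Keep in mind that the breakpoint walkthrough above was written for the default of 4 threads.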
====Source Code====