GPU621/Go Parallel
=== Code Differences ===
''' Go Code for Monte Carlo Estimation of Pi '''
<source lang="go">
package main

import (
	"fmt"
	"math"
	"math/rand"
	"os"
	"runtime"
	"strconv"
	"sync"
	"time"
)

func monte_carlo_pi(radius float64, reps int, result *int, wait *sync.WaitGroup) {
	var x, y float64
	count := 0
	seed := rand.NewSource(time.Now().UnixNano()) // creates a seed using the current time
	random := rand.New(seed)                      // creates a new random generator using the seed

	for i := 0; i < reps; i++ { // for each rep, draw a random point scaled by the radius
		x = random.Float64() * radius
		y = random.Float64() * radius

		if num := math.Sqrt(x*x + y*y); num < radius { // checks whether the point falls inside the quarter circle
			count++
		}
	}

	*result = count
	wait.Done()
}

func main() {
	cores := runtime.NumCPU() // number of logical CPUs available at runtime
	runtime.GOMAXPROCS(cores) // sets the maximum number of OS threads that execute Go code simultaneously

	var wait sync.WaitGroup

	counts := make([]int, cores) // one counter per core, so goroutines never share a slot

	samples, _ := strconv.Atoi(os.Args[1]) // number of samples, taken from the command line

	start := time.Now() // starting time
	wait.Add(cores)     // one WaitGroup slot per goroutine

	for i := 0; i < cores; i++ {
		go monte_carlo_pi(100.0, samples/cores, &counts[i], &wait) // launches one goroutine per core, scheduled across the GOMAXPROCS threads
	}

	wait.Wait() // waits until all goroutines have finished

	total := 0
	for i := 0; i < cores; i++ {
		total += counts[i]
	}

	pi := (float64(total) / float64(samples)) * 4 // estimate of pi

	fmt.Println("Time: ", time.Since(start))
	fmt.Println("pi: ", pi)
	fmt.Println("")
}
</source>
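The Go code above draws points uniformly inside a square of side ''radius''. The fraction of points that land inside the quarter circle of the same radius approaches the ratio of the two areas, (πr²/4) / r² = π/4, so multiplying the observed fraction, count / samples, by 4 gives the estimate of π that is printed at the end. The C++ version below uses the same estimator.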
''' C++ OpenMP Code for Monte Carlo Estimation of Pi '''
<source lang="cpp">
/*
 * Compute pi by Monte Carlo calculation of the area of a circle
 *
 * parallel version using OpenMP
 */
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <chrono>
#include <omp.h>
using namespace std::chrono;
using namespace std;

void reportTime(const char *msg, steady_clock::duration span)
{
    auto ms = duration_cast<milliseconds>(span);
    std::cout << msg << " - took - " << ms.count() << " milliseconds" << std::endl;
}

int main(int argc, char *argv[])
{
    const char Usage[] = "Usage: pi <steps> <repeats> (try 1000000 4)";
    if (argc < 3)
    {
        cerr << Usage << endl;
        return (1);
    }
    int num_steps = atoi(argv[1]);
    int num_repeats = atoi(argv[2]);

    printf("Computing pi via Monte Carlo using %d steps, repeating %d times\n",
           num_steps, num_repeats);

    // A little throwaway parallel section just to show the number of threads
    #pragma omp parallel
    #pragma omp master
    printf("Using %d threads\n", omp_get_num_threads());

    steady_clock::time_point ts, te;

    int count = 0;
    ts = steady_clock::now();
    #pragma omp parallel for reduction(+ : count) // reduction of the count across threads
    for (int i = 0; i < num_steps; i++)
    {
        // note: rand() shares global state across the threads
        double x = (double)rand() / RAND_MAX;
        double y = (double)rand() / RAND_MAX;
        if (x * x + y * y < 1) // checks whether the point falls inside the unit circle
            count++;
    }
    te = steady_clock::now();
    double pi = 4.0 * count / num_steps;
    printf("pi = %lf \n", pi);
    reportTime("pi", te - ts);
    printf("\n");
    return 0;
}
</source>
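For reference, the OpenMP version needs GCC's -fopenmp flag in addition to optimization (for example, g++ -O3 -fopenmp); the exact file names are not part of the original page. The Go version builds with a plain go build or runs with go run, with the sample count passed as the first command-line argument.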
=== Time Differences ===
The time difference between the Go code and the C++ code grows significantly as the sample size increases. The table below shows the difference in run time between the two programs.
 
 
{| class="wikitable"
|+ Run-time comparison of the two programs
|-
! Sample Size !! C++ OMP !! Go Parallel !! Speed-up (C++ time ÷ Go time)
|-
| 100,000 || 8ms || 0.7ms || 11 times
|-
| 1,000,000 || 79ms || 1.9ms || 42 times
|-
| 10,000,000 || 1005ms || 10ms || 101 times
|-
| 100,000,000 || 9757ms || 105ms || 92 times
|-
| 1,000,000,000 || 97525ms || 1094ms || 89 times
|}
'''Note:'''
Go is using one CPU core with the maximum number of threads available on the OS, whereas C++ is using 8 threads. Go is built with default compiler settings; C++ is built with g++ at the O3 optimization level.
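To make the two configurations more comparable, the thread count can be pinned on both sides: in Go by passing an explicit value to runtime.GOMAXPROCS instead of runtime.NumCPU(), and in OpenMP by calling omp_set_num_threads or setting the OMP_NUM_THREADS environment variable.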
 
 
The difference is too large to be shown clearly on a graph.
=== Ease of Use ===
Overall, Go is much easier to use because it has built-in support for concurrency and parallelism, while C++ needs an external dependency such as OpenMP. Go also detects the available CPU cores and schedules goroutines across them with the maximum number of threads available, which makes the comparison somewhat unfair, since Go automatically uses a good configuration for the solution, whereas in C++ with OpenMP the number of threads has to be specified externally. Implementing a parallel or concurrent solution is simply much easier in Go than in C++.
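As a rough sketch of what this built-in support looks like (not taken from the course material; the work function and the problem size are placeholders), the example below splits a loop across one goroutine per CPU core using only the standard library, with no compiler flags or pragmas:
<source lang="go">
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// work is a placeholder task: it sums the integers in [lo, hi).
func work(lo, hi int, out *int, wg *sync.WaitGroup) {
	defer wg.Done()
	sum := 0
	for i := lo; i < hi; i++ {
		sum += i
	}
	*out = sum
}

func main() {
	cores := runtime.NumCPU()     // no external library is needed to find the core count
	results := make([]int, cores) // one result slot per goroutine, so nothing is shared

	var wg sync.WaitGroup
	wg.Add(cores)

	const n = 1000000
	chunk := n / cores
	for i := 0; i < cores; i++ {
		lo, hi := i*chunk, (i+1)*chunk
		if i == cores-1 {
			hi = n // the last goroutine picks up any remainder
		}
		go work(lo, hi, &results[i], &wg) // "go" starts a goroutine; the runtime schedules it onto OS threads
	}
	wg.Wait()

	total := 0
	for _, r := range results {
		total += r
	}
	fmt.Println("total:", total)
}
</source>
Achieving the same split in C++ would require compiling with -fopenmp and annotating the loop with an OpenMP directive such as #pragma omp parallel for.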
== Relevance ==