GPU621/Go Parallel
=== Code Differences ===
''' Go Code for Monte Carlo Estimation of Pi '''
<source lang="go">
package main

import (
	"fmt"
	"math"
	"math/rand"
	"os"
	"runtime"
	"strconv"
	"sync"
	"time"
)

func monte_carlo_pi(radius float64, reps int, result *int, wait *sync.WaitGroup) {
	var x, y float64
	count := 0
	seed := rand.NewSource(time.Now().UnixNano()) // creates a seed using the current time
	random := rand.New(seed)                      // creates a new random generator using the seed

	for i := 0; i < reps; i++ { // for each rep, draw a random point scaled by the radius
		x = random.Float64() * radius
		y = random.Float64() * radius

		if num := math.Sqrt(x*x + y*y); num < radius { // checks whether the point falls inside the quarter circle
			count++
		}
	}

	*result = count
	wait.Done()
}

func main() {
	cores := runtime.NumCPU() // number of logical CPUs available at runtime
	runtime.GOMAXPROCS(cores) // sets the maximum number of OS threads that execute Go code simultaneously

	var wait sync.WaitGroup

	counts := make([]int, cores) // one counter per core, so goroutines never share a slot

	samples, _ := strconv.Atoi(os.Args[1]) // number of samples, taken from the command line

	start := time.Now() // starting time
	wait.Add(cores)     // one WaitGroup slot per goroutine

	for i := 0; i < cores; i++ {
		go monte_carlo_pi(100.0, samples/cores, &counts[i], &wait) // launches one goroutine per core, scheduled across the GOMAXPROCS threads
	}

	wait.Wait() // waits until all goroutines have finished

	total := 0
	for i := 0; i < cores; i++ {
		total += counts[i]
	}

	pi := (float64(total) / float64(samples)) * 4 // estimate of pi

	fmt.Println("Time: ", time.Since(start))
	fmt.Println("pi: ", pi)
	fmt.Println("")
}
</source>
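The Go code above draws points uniformly inside a square of side ''radius''. The fraction of points that land inside the quarter circle of the same radius approaches the ratio of the two areas, (πr²/4) / r² = π/4, so multiplying the observed fraction, count / samples, by 4 gives the estimate of π that is printed at the end. The C++ version below uses the same estimator.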
''' C++ OpenMP Code for Monte Carlo Estimation of Pi '''
<source lang="cpp">
/*
 * Compute pi by Monte Carlo calculation of the area of a circle
 *
 * parallel version using OpenMP
 */
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <chrono>
#include <omp.h>
using namespace std::chrono;
using namespace std;

void reportTime(const char *msg, steady_clock::duration span)
{
    auto ms = duration_cast<milliseconds>(span);
    std::cout << msg << " - took - " << ms.count() << " milliseconds" << std::endl;
}

int main(int argc, char *argv[])
{
    const char Usage[] = "Usage: pi <steps> <repeats> (try 1000000 4)";
    if (argc < 3)
    {
        cerr << Usage << endl;
        return (1);
    }
    int num_steps = atoi(argv[1]);
    int num_repeats = atoi(argv[2]);

    printf("Computing pi via Monte Carlo using %d steps, repeating %d times\n",
           num_steps, num_repeats);

    // A little throwaway parallel section just to show the number of threads
    #pragma omp parallel
    #pragma omp master
    printf("Using %d threads\n", omp_get_num_threads());

    steady_clock::time_point ts, te;

    int count = 0;
    ts = steady_clock::now();
    #pragma omp parallel for reduction(+ : count) // reduction of the count across threads
    for (int i = 0; i < num_steps; i++)
    {
        // note: rand() shares global state across the threads
        double x = (double)rand() / RAND_MAX;
        double y = (double)rand() / RAND_MAX;
        if (x * x + y * y < 1) // checks whether the point falls inside the unit circle
            count++;
    }
    te = steady_clock::now();
    double pi = 4.0 * count / num_steps;
    printf("pi = %lf \n", pi);
    reportTime("pi", te - ts);
    printf("\n");
    return 0;
}
</source>
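For reference, the OpenMP version needs GCC's -fopenmp flag in addition to optimization (for example, g++ -O3 -fopenmp); the exact file names are not part of the original page. The Go version builds with a plain go build or runs with go run, with the sample count passed as the first command-line argument.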
=== Time Differences ===
The time difference between the Go code and the C++ code grows significantly as the sample size increases. The table below shows the difference in run time between the two programs.
 
 
{| class="wikitable"
|+ Run-time comparison of the two programs
|-
! Sample Size !! C++ OMP !! Go Parallel !! Speed-up (C++ time ÷ Go time)
|-
| 100,000 || 8ms || 0.7ms || 11 times
|-
| 1,000,000 || 79ms || 1.9ms || 42 times
|-
| 10,000,000 || 1005ms || 10ms || 101 times
|-
| 100,000,000 || 9757ms || 105ms || 92 times
|-
| 1,000,000,000 || 97525ms || 1094ms || 89 times
|}
'''Note:'''
Go is using one CPU core with the maximum number of threads available on the OS, whereas C++ is using 8 threads. Go is built with default compiler settings; C++ is built with g++ at the O3 optimization level.
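To make the two configurations more comparable, the thread count can be pinned on both sides: in Go by passing an explicit value to runtime.GOMAXPROCS instead of runtime.NumCPU(), and in OpenMP by calling omp_set_num_threads or setting the OMP_NUM_THREADS environment variable.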
 
 
The difference is too large to be shown clearly on a graph.
=== Ease of Use ===
Overall, Go is much easier to use because it has built-in support for concurrency and parallelism, while C++ needs an external dependency such as OpenMP. Go also detects the available CPU cores and schedules goroutines across them with the maximum number of threads available, which makes the comparison somewhat unfair, since Go automatically uses a good configuration for the solution, whereas in C++ with OpenMP the number of threads has to be specified externally. Implementing a parallel or concurrent solution is simply much easier in Go than in C++.
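As a rough sketch of what this built-in support looks like (not taken from the course material; the work function and the problem size are placeholders), the example below splits a loop across one goroutine per CPU core using only the standard library, with no compiler flags or pragmas:
<source lang="go">
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// work is a placeholder task: it sums the integers in [lo, hi).
func work(lo, hi int, out *int, wg *sync.WaitGroup) {
	defer wg.Done()
	sum := 0
	for i := lo; i < hi; i++ {
		sum += i
	}
	*out = sum
}

func main() {
	cores := runtime.NumCPU()     // no external library is needed to find the core count
	results := make([]int, cores) // one result slot per goroutine, so nothing is shared

	var wg sync.WaitGroup
	wg.Add(cores)

	const n = 1000000
	chunk := n / cores
	for i := 0; i < cores; i++ {
		lo, hi := i*chunk, (i+1)*chunk
		if i == cores-1 {
			hi = n // the last goroutine picks up any remainder
		}
		go work(lo, hi, &results[i], &wg) // "go" starts a goroutine; the runtime schedules it onto OS threads
	}
	wg.Wait()

	total := 0
	for _, r := range results {
		total += r
	}
	fmt.Println("total:", total)
}
</source>
Achieving the same split in C++ would require compiling with -fopenmp and annotating the loop with an OpenMP directive such as #pragma omp parallel for.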
== Relevance ==