Project Name
Parallel Programming in Go & Java
Group Members
Project Description
This project will look at the performance difference between /Go/Java using a classic matrix multiplication algorithm. It will present how to do parallel program in each language and show the different syntax codes that's using the similar algorithm. We will gather the results of running the code and show the graphical result to clearly show to performance difference between the two programming language. It will also briefly cover the history of each language, key term of parallel programming, look at the fundamental difference between two languages to see what may have caused the performance difference.
Progress
What's GO Language?
Go Language is a open source programming language that is designed at Google by Robert Griesemer, Ken Thompson, and Rob Pike.
Go lang is a Fast and Pretty language.
Go language is a compiler language which makes its execution fast. Go is faster than Python & JavaScript. In some cases, Go may be faster than Java or as fast as Java.
Go also has a relatively easy syntax which makes it easy to learn and fast to develop.
Is Go Popular?
GO is used by Google, Youtube, Google Download Server, Uber replaced Node.JS to Go, Twitch's trance-coding process Go & C++.
What's Java Language?
Java is open source and Object-oriented programming language developed by Sun Microsystems in 1995. The founder is James Gosling. In 2010, Oracle acquired Sun Microsystems and owned Java's copyright.
Java is simple programming language, easy to learn, clean and easy to understand. This is because syntax of Java is based on C++ that has removed many confusing and rarely used functionality by user (e.g. operator overloading, explicit pointer, automatic garbage collection etc.)
Java is the platform independent language. Programs can run on several different types of computer as long as computer has a Java Runtime Environment(JRE) installed.
Parallel Programming in GO & Java
Parallelism and Concurrency
Rob Pike who is developer of Go language has defined parallelism and concurrency in his speech "Concurrency is not Parallelism":
Parallelism defined as
Parallelism is about doing lots of things at once.
Concurrency defined as
Concurrency is about dealing with lots of things at once.
Concurrency allows to execute multiple tasks in partial order or out of order. The executions must finished before the next execution can start. While, parallelism has simultaneous execution on a multicore per CPU.
If we combine two concepts:
Concurrency Concurrency + parallelism (Single-Core CPU) (Multi-Core CPU) ___ ___ ___ |th1| |th1|th2| | | | |___| |___|___ | |___ |th2| |___|th2| ___|___| ___|___| |th1| |th1| |___|___ | |___ |th2| | |th2|
diagram cited by Pithikos in Stack Overflow
Go language focuses on concurrency and parallelism by implementing its own Goroutine instead of regular threading. Goroutines uses the built-in scheduler to conduct the threads in behind and hide many complexity of thread management from the user to specify which function to run and execute them with “Go” command.
Java language can implement thread in two different methods. The first method is to create a class which extends the “Thread” class that gives the user to have control of the lifecycle of the thread. But Java language only allows single inheritance which prevents the thread from being extend other base class. The second method is using ‘Runnable’ interface. This method allows to user to extend other base class while still being run as a new thread. However, this method does not give user to control lifecycle of thread.
Java version 5.0 and later provides concurrency to simplify thread creation and processing. The "Executor" that implement by java.util.concurrent provides APIs for creating and processing new threads using reusable threads.
Implementation for Go
Installation of Go
Go to https://golang.org/dl/ & Download Go accordingly to your OS.
Multi-Thread Program in Go
package main import ( "bufio" "fmt" "log" "os" "runtime" "runtime/debug" "strconv" "strings" "sync" "time" ) func multiply(A [][]int, B [][]int) [][]int { sizeA := len(A) sizeB := len(B) n := make([][]int, sizeA) for i := range n { n[i] = make([]int, sizeB) } for i := 0; i < sizeA; i++ { for k := 0; k < sizeB; k++ { temp := A[i][k] for j := 0; j < sizeB; j++ { n[i][j] += temp * B[k][j] } } } return n } func splitMatrix(nrOfThreads int, matrix [][]int) (matrixes [][][]int) { splitter := len(matrix) / nrOfThreads for i := 0; i < nrOfThreads; i++ { matrixes = append(matrixes, matrix[splitter*i:(splitter*(i+1))]) } return } func multiplyStuff(finalMatrix *[][][]int, matrix1 [][]int, matrix2 [][]int, i int) { (*finalMatrix)[i] = multiply(matrix1, matrix2) } func readFile(filePath string) (matrix1 [][]int, matrix2 [][]int) { file, err := os.Open(filePath) if err != nil { log.Fatal(err) } defer file.Close() var temp []int matrixNr := 1 scanner := bufio.NewScanner(file) for scanner.Scan() { words := strings.Fields(scanner.Text()) if len(words) != 0 { for _, element := range words { i, err := strconv.Atoi(element) if err != nil { log.Fatal(err) } temp = append(temp, i) } if matrixNr == 1 { matrix1 = append(matrix1, temp) matrix2 = append(matrix2, temp) } temp = nil } else { matrixNr = 2 } } if err := scanner.Err(); err != nil { log.Fatal(err) } return } func main() { file := os.Args[1] nrOfThreads, err := strconv.Atoi(os.Args[2]) if err != nil { log.Fatal("USAGE: " + os.Args[0] + " <file> <nrOfThreads>") } debug.SetGCPercent(-1) if nrOfThreads <= 0 { runtime.GOMAXPROCS(1) } else if nrOfThreads >= 16 { runtime.GOMAXPROCS(8) } else { runtime.GOMAXPROCS(nrOfThreads) } var wg sync.WaitGroup finishedMatrix := make([][][]int, nrOfThreads) matrix1, matrix2 := readFile(file) if len(matrix1) != len(matrix2) || (nrOfThreads != 0 && len(matrix1)%nrOfThreads != 0) { log.Fatal("USAGE: " + os.Args[0] + " <file> <nrOfThreads>") } var start int64 if nrOfThreads == 0 { start = time.Now().UnixNano() multiply(matrix1, matrix2) } else { matrixes := splitMatrix(nrOfThreads, matrix1) start = time.Now().UnixNano() for i := 0; i < nrOfThreads; i++ { wg.Add(1) go func(index int) { defer wg.Done() multiplyStuff(&finishedMatrix, matrixes[index], matrix2, index) }(i) } wg.Wait() } end := time.Now().UnixNano() fmt.Printf("Execution took %d ns\n", (end - start)) runtime.GC() }
.sh Code to Run Go Program
Above Go code will run in terminal using .sh script.
#! /bin/sh COUNTER=0 MAX=7 go build MatrixMultiGo.go > result.txt echo "Running tests for no multithreading..." echo "### No Multithreading" >> result.txt while [ $COUNTER -lt $MAX ] do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 ./MatrixMultiGo 4096.in 0 >> result.txt done COUNTER=0 echo "Running tests for 1 thread..." echo "### Threading with 1 thread" >> result.txt while [ $COUNTER -lt $MAX ] do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 ./MatrixMultiGo 4096.in 1 >> result.txt done COUNTER=0 echo "Running tests for 2 threads..." echo "### Multithreading with 2 threads" >> result.txt while [ $COUNTER -lt $MAX ] do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 ./MatrixMultiGo 4096.i 0>> result.txt done COUNTER=0 echo "Running tests for 4 threads..." echo "### Multithreading with 4 threads" >> result.txt while [ $COUNTER -lt $MAX ] do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 ./MatrixMultiGo 4096.in 4 >> result.txt done COUNTER=0 echo "Running tests for 8 threads..." echo "### Multithreading with 8 threads" >> result.txt while [ $COUNTER -lt $MAX ] do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 ./MatrixMultiGo 4096.in 8 >> result.txt done COUNTER=0 echo "Running tests for 16 threads..." echo "### Multithreading with 16 threads" >> result.txt while [ $COUNTER -lt $MAX ] do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 ./MatrixMultiGo 4096.in 16 >> result.txt done echo "Done";
Implementation for Java
Installation of Java
Go to https://www.java.com/ko/download/manual.jsp & Download Java accordingly to your OS.
Multi-Thread Program in Java
import java.io.BufferedReader; import java.io.FileReader; import java.io.IOException; import java.util.LinkedList; import java.util.List; import java.util.ArrayList; public class Java_Thread { public List<ArrayList<ArrayList<Integer>>> read(String filename) { ArrayList<ArrayList<Integer>> A = new ArrayList<ArrayList<Integer>>(); ArrayList<ArrayList<Integer>> B = new ArrayList<ArrayList<Integer>>(); String thisLine; try { BufferedReader br = new BufferedReader(new FileReader(filename)); while ((thisLine = br.readLine()) != null) { if (thisLine.trim().equals("")) { break; } else { ArrayList<Integer> line = new ArrayList<Integer>(); String[] lineArray = thisLine.split("\t"); for (String number : lineArray) { line.add(Integer.parseInt(number)); } A.add(line); B.add(line); } } br.close(); } catch (IOException e) { System.err.println("Error: " + e); } List<ArrayList<ArrayList<Integer>>> res = new LinkedList<ArrayList<ArrayList<Integer>>>(); res.add(A); res.add(B); return res; } public int[][] matrixMultiplication(ArrayList<ArrayList<Integer>> A, ArrayList<ArrayList<Integer>> B, int m, int n) { int[][] C = new int[m][n]; for (int i = 0; i < m; i++) { for (int k = 0; k < n; k++) { int temp = A.get(i).get(k); for (int j = 0; j < n; j++) { C[i][j] += temp * B.get(k).get(j); } } } return C; } public ArrayList<ArrayList<ArrayList<Integer>>> splitMatrix(ArrayList<ArrayList<Integer>> A, int nrOfThreads) { int n = A.size(); int m = n / nrOfThreads; ArrayList<ArrayList<ArrayList<Integer>>> B = new ArrayList<ArrayList<ArrayList<Integer>>>(); for (int i = 0; i < nrOfThreads; i++){ B.add(new ArrayList<ArrayList<Integer>>(A.subList(i*m, (i+1)*m))); } return B; } public static void main(String[] args) { String filename = ""; int nrOfThreads = 0; if (args.length < 2) { System.out.println("Missing filename and/or number of threads\nUSAGE: java Java_Thread <file> <nrOfThreads>"); System.exit(1); } else { filename = args[0]; nrOfThreads = Integer.parseInt(args[1]); Java_Thread java_Thread = new Java_Thread(); java_Thread.start(filename, nrOfThreads); } } public void start(String filename, int nrOfThreads) { List<ArrayList<ArrayList<Integer>>> matrices = read(filename); ArrayList<ArrayList<Integer>> A = matrices.get(0); ArrayList<ArrayList<Integer>> B = matrices.get(1); if (nrOfThreads <= 0) { int n = A.size(); long startTime = System.nanoTime(); int[][] C = matrixMultiplication(A, B, n, n); long endTime = System.nanoTime(); System.out.println("Execution took " + (endTime - startTime) + " ns"); } else { if (A.size() % nrOfThreads != 0) { System.out.println("Size of matrix is not divisible by the supplied number of threads"); System.exit(1); } ArrayList<Worker> threads = new ArrayList<Worker>(); ArrayList<int[][]> result = new ArrayList<int[][]>(); int[][] empty = new int[][]{{}}; for (int i = 0; i < nrOfThreads; i++) { result.add(empty); } ArrayList<ArrayList<ArrayList<Integer>>> workerMatrices = splitMatrix(A, nrOfThreads); long startTime = System.nanoTime(); for (int i = 0; i < nrOfThreads; i++) { i, result)); } threads.add(new Worker(workerMatrices.get(i), B, threads.get(i).start(); for (int i = 0; i < nrOfThreads; i++) { try { threads.get(i).join(); } catch (Exception e) { System.err.println(e); } } long endTime = System.nanoTime(); System.out.println("Execution took " + (endTime - startTime) + " ns"); } } class Worker extends Thread { private ArrayList<ArrayList<Integer>> A; private ArrayList<ArrayList<Integer>> B; private int index; private ArrayList<int[][]> result; private int m; private int n; public Worker(ArrayList<ArrayList<Integer>> A, ArrayList<ArrayList<Integer>> B, int index, ArrayList<int[][]> result) { this.A = A; this.B = B; this.index = index; this.result = result; this.m = A.size(); this.n = B.size(); } @Override public void run() { this.result.set(this.index, matrixMultiplication(this.A, this.B, this.m, this.n)); } } }
.sh Code to Run Java Program
#! /bin/sh COUNTER=0 MAX=7 javac Java_Thread.java > result.txt echo "Running tests for no multithreading..." echo "### No Multithreading" >> result.txt while [ $COUNTER -lt $MAX ]; do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 java -Xms8g -Xmx8g Java_Thread 4096.in 0 >> result.txt done COUNTER=0 echo "Running tests for 1 thread..." echo "### Threading with 1 thread" >> result.txt while [ $COUNTER -lt $MAX ]; do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 java -Xms8g -Xmx8g Java_Thread 4096.in 1 >> result.txt done COUNTER=0 echo "Running tests for 2 threads..." echo "### Multithreading with 2 threads" >> result.txt while [ $COUNTER -lt $MAX ]; do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 java -Xms8g -Xmx8g Java_Thread 4096.in 2 >> result.txt done COUNTER=0 echo "Running tests for 4 threads..." echo "### Multithreading with 4 threads" >> result.txt while [ $COUNTER -lt $MAX ]; do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 java -Xms8g -Xmx8g Java_Thread 4096.in 4 >> result.txt done COUNTER=0 echo "Running tests for 8 threads..." echo "### Multithreading with 8 threads" >> result.txt while [ $COUNTER -lt $MAX ]; do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 java -Xms8g -Xmx8g Java_Thread 4096.in 8 >> result.txt done COUNTER=0 echo "Running tests for 16 threads..." echo "### Multithreading with 16 threads" >> result.txt while [ $COUNTER -lt $MAX ]; do ((COUNTER++)) echo "Running test #$COUNTER" nice -n -20 java -Xms8g -Xmx8g Java_Thread 4096.in 16 >> result.txt done echo "Done";
Result
The data used in the experiment is a 4096x4096 matrix
Go | Java | |
---|---|---|
Main thread | 84.66 s | 171.76 s |
1 thread | 84.28 s | 169.51 s |
2 thread | 37.08 s | 73.04 s |
4 thread | 19.2 s | 34.91 s |
8 thread | 11.41 s | 17.63 s |
16 thread | 14.38 s | 16.08 s |
difference | |
---|---|
Main Thread to 1 thread | 2.25% |
1 thread to 2 threads | 56.91% |
2 threads to 4 threads | 51.90% |
4 threads to 8 threads | 49.49 |
8 threads to 16 threads | 8.79% |
difference | |
---|---|
Main Thread to 1 thread | 0.45% |
1 thread to 2 threads | 127.30% |
2 threads to 4 threads | 93.12% |
4 threads to 8 threads | 68.27% |
8 threads to 16 threads | -20.65% |
difference | |
---|---|
Main thread | 102% |
1 thread | 101% |
2 thread | 96.90% |
4 thread | 81.80% |
8 thread | 55% |
16 thread | 12% |