The test was run on 99,999,999 data points with a learning rate of 0.001 and 100 iterations of the calculation
(i.e. <source>$<cmd> 99999999 .001 100</source>). After each case, we also analyzed the run with VTune Amplifier to understand its performance issues and to compare the cases.
===Single-thread===
A single-threaded approach to finding the line of best fit.
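
As a rough illustration of what such a single-threaded fit might look like, here is a minimal sketch. It assumes batch gradient descent on the mean squared error, which the learning rate and iteration count suggest but the text does not confirm. The names <code>Line</code> and <code>fit_line</code> are hypothetical and are not taken from the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: slope m and intercept b of y = m*x + b.
struct Line {
    double m;
    double b;
};

// Sketch of batch gradient descent on mean squared error, using one thread.
// Assumption: every iteration makes a full pass over all data points.
Line fit_line(const std::vector<double>& xs, const std::vector<double>& ys,
              double learn_rate, int iterations) {
    assert(xs.size() == ys.size() && !xs.empty());
    const double n = static_cast<double>(xs.size());
    Line line{0.0, 0.0};
    for (int it = 0; it < iterations; ++it) {
        double grad_m = 0.0;
        double grad_b = 0.0;
        // Accumulate the error terms that drive both gradients.
        for (std::size_t i = 0; i < xs.size(); ++i) {
            const double err = line.m * xs[i] + line.b - ys[i];
            grad_m += err * xs[i];
            grad_b += err;
        }
        // Step against the gradient of the mean squared error.
        line.m -= learn_rate * (2.0 / n) * grad_m;
        line.b -= learn_rate * (2.0 / n) * grad_b;
    }
    return line;
}
```

In this form the inner loop touches every data point once per iteration, so the total work grows with the number of points times the number of iterations. That inner loop is the natural place to look when comparing against multi-threaded variants.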