DPS921/Franky

1,209 bytes added, 00:54, 26 November 2018
===Optimized with DAAL===
Intel provides a library called the Data Analytics Acceleration Library (DAAL). It targets big data problems and contains optimized algorithmic building blocks for constructing efficient solutions.
The library includes algorithms for a range of machine learning problems, including linear regression.
 
Two sets of data were generated from the serial version of the regression algorithm. The serial version was run twice, and the x[N] and y[N] arrays produced by the normal random number generator were written to two separate CSV files, test.csv and train.csv. The x[N] and y[N] values in these two files follow a normal distribution as defined in the serial algorithm code, with N = 99,999,999.
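As a rough illustration of the data generation step, the sketch below writes normally distributed x and y values to a CSV file. It is not the serial code from this project: the helper name writeSyntheticCsv, the distribution parameters, the linear relationship between x and y, and the seed are all assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <random>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the project's serial code).
// Writes n rows of "x,y" to the file at path and returns the number of rows written.
// Assumed model: x ~ N(0, 1), y = slope * x + intercept + noise, noise ~ N(0, 0.1).
std::size_t writeSyntheticCsv(const std::string& path, std::size_t n, unsigned seed,
                              double slope = 2.0, double intercept = 1.0)
{
    assert(n > 0);  // an empty data set is not useful for regression

    std::ofstream out(path);
    if (!out) {
        throw std::runtime_error("cannot open " + path);
    }

    std::mt19937 gen(seed);  // fixed seed keeps a run reproducible
    std::normal_distribution<double> xDist(0.0, 1.0);
    std::normal_distribution<double> noise(0.0, 0.1);

    std::size_t rows = 0;
    for (; rows < n; ++rows) {
        const double x = xDist(gen);
        const double y = slope * x + intercept + noise(gen);
        out << x << ',' << y << '\n';  // one observation per line
    }
    return rows;
}
```

Calling the helper twice with different seeds, for example writeSyntheticCsv("train.csv", N, 1) and writeSyntheticCsv("test.csv", N, 2), would give two independent files in the spirit of the two runs described above.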
 
The example program "lin_reg_norm_eq_dense_batch.cpp" from the DAAL library was adapted to test the linear regression model. First, the function "trainModel()" is called. This function reads the "train.csv" data and then merges the columns based on the number of independent and dependent variables; in this case it is simple regression with 1 dependent and 1 independent variable. An optimized algorithm is then initialized, the training data and dependent values are passed in, and the model is trained on the data in the CSV file. The training result is a line-of-best-fit model for the data. The "testModel()" function is then called, which initializes a prediction algorithm. It works by passing the independent values from "test.csv" into the trained model, which predicts the corresponding dependent values.
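To make the train-then-predict flow concrete, here is a minimal plain C++ sketch of simple linear regression solved with the normal equations. It does not use the DAAL API and is not the code from lin_reg_norm_eq_dense_batch.cpp; the names LinearModel, fitLine, and predictLine are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model type: y = intercept + slope * x.
struct LinearModel {
    double intercept;
    double slope;
};

// Training step: closed-form solution of the normal equations
// for one independent and one dependent variable.
LinearModel fitLine(const std::vector<double>& x, const std::vector<double>& y)
{
    assert(x.size() == y.size() && x.size() >= 2);

    const double n = static_cast<double>(x.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx += x[i];
        sy += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }

    const double denom = n * sxx - sx * sx;
    assert(denom != 0.0);  // all x values identical: the slope is undefined

    LinearModel model;
    model.slope = (n * sxy - sx * sy) / denom;
    model.intercept = (sy - model.slope * sx) / n;
    return model;
}

// Prediction step: independent values go in, predicted dependent values come out.
std::vector<double> predictLine(const LinearModel& model, const std::vector<double>& x)
{
    std::vector<double> predicted;
    predicted.reserve(x.size());
    for (double xi : x) {
        predicted.push_back(model.intercept + model.slope * xi);
    }
    return predicted;
}
```

In this sketch, fitLine plays the role of the training step and predictLine the role of the test step; a library such as DAAL performs the same computation on numeric tables with optimized kernels.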
 
 
====Code====
====Performance====