49
edits
Changes
→Data Management
*Data preperation
*Algorithim computation
Data management refers to a set of operations that work on the data and are distributed between the stages of the data analytics pipeline. The data management flow is shown in the figure below. You start with your raw data and its acquisition. The first step is to transfer the out of memory data, the source could be from files, databases, or remote storage, into an in-memory representation.
[[File:ManagemenFlowDal.jpg|900px]] <br/>
This is a tabular view of data where the columns represent features, these are properties or qualities of a real object or event, and the table rows represent observations which are feature vectors that are used to encode information for a real object or event. This is used in our machine learning regression example.