1. This week's reading assignment is Chapter 2 of the course textbook: EMC Education Services (Eds.). (2015). Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing, and Presenting Data. Indianapolis, IN: John Wiley & Sons, Inc. ISBN: 978-1-118-87613-8.
2. Read: https://en.wikipedia.org/wiki/Systems_development_…
3. Read: https://medium.com/data-ops/dataops-is-not-just-de…
This week's assignments, due dates, formats, and grading percentages are shown in the table below.
Data mining is a complex process that aims to discover patterns in large data sets starting from a collection of existing data. In my opinion, data mining contains four main steps:
Classification algorithms predict one or more discrete variables, based on the other attributes in the dataset.
Regression algorithms predict one or more continuous numeric variables, such as profit or loss, based on other attributes in the dataset.
Segmentation/Clustering algorithms divide data into groups, or clusters, of items that have similar properties.
Association algorithms find correlations between different attributes in a dataset. The most common application of this kind of algorithm is for creating association rules, which can be used in a market basket analysis.
Sequence analysis algorithms summarize frequent sequences or episodes in data, such as a series of clicks on a website or a series of log events preceding machine maintenance.
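To make two of these algorithm families more concrete, here is a minimal sketch in plain Python of the ideas behind association (market basket analysis) and sequence analysis. It is not a production implementation: the baskets, click sessions, thresholds, and function names (`support`, `confidence`, `frequent_pairs`, `frequent_bigrams`) are all hypothetical and chosen only for illustration. Real tools typically use more scalable algorithms such as Apriori or FP-Growth.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set is one shopping basket.
BASKETS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

# Hypothetical toy data: each list is one user's ordered clicks.
SESSIONS = [
    ["home", "search", "product"],
    ["home", "search", "cart"],
    ["search", "product", "cart"],
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimate P(consequent | antecedent) from the baskets."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / base


def frequent_pairs(baskets, min_support):
    """Return {pair: support} for item pairs that meet `min_support`."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always counted under the same key.
        counts.update(combinations(sorted(basket), 2))
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def frequent_bigrams(sessions, min_count):
    """Count consecutive event pairs across sessions (simple sequence analysis)."""
    counts = Counter()
    for events in sessions:
        counts.update(zip(events, events[1:]))
    return {pair: c for pair, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    print(frequent_pairs(BASKETS, min_support=0.6))
    print(confidence({"bread"}, {"milk"}, BASKETS))
    print(frequent_bigrams(SESSIONS, min_count=2))
```

In this toy data, a rule such as "bread implies milk" is judged by its support (how often both items appear together) and its confidence (how often milk appears given that bread does). The bigram counter applies the same counting idea to ordered events instead of unordered items.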
DataOps is NOT Just DevOps for Data