What is machine learning
Field of study that gives computers the ability to learn without being explicitly programmed.
– Arthur Samuel (1959)
Supervised Learning
learns from examples that come with the ‘right answers’, i.e. data where every input x is labeled with its correct output y
regression algorithms
predict a number
infinitely many possible outputs
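A minimal sketch of regression, assuming the simplest case of fitting a straight line y = w*x + b by least squares. The house-size and price numbers are made up for illustration and do not come from these notes.

```python
def fit_line(xs, ys):
    """Fit y = w*x + b by ordinary least squares (closed-form solution)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# Hypothetical labeled data: house size (m^2) -> price (thousands).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

w, b = fit_line(sizes, prices)

# The prediction is a number; any real value is a possible output.
predicted = w * 90.0 + b
print(predicted)
```

The labels here are the known prices (the ‘right answers’); the model then predicts a price for a size it has never seen.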
classification algorithms
predict categories (pick one class out of a finite set)
small number of possible outputs
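A minimal sketch of classification, assuming a 1-nearest-neighbor rule (one of many possible classifiers, chosen here only because it is short). The tumor sizes and labels are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict_label(train_points, train_labels, query):
    """1-nearest-neighbor: return the label of the closest training point."""
    nearest = min(
        range(len(train_points)),
        key=lambda i: sq_dist(train_points[i], query),
    )
    return train_labels[nearest]


# Hypothetical labeled data: tumor size (cm) -> diagnosis.
sizes_cm = [(1.0,), (1.5,), (4.0,), (5.0,)]
labels = ["benign", "benign", "malignant", "malignant"]

# The output is always one of a small set of categories.
print(predict_label(sizes_cm, labels, (1.2,)))
print(predict_label(sizes_cm, labels, (4.4,)))
```

Unlike regression, the output can only be one of the labels seen in training, never a value in between.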
Unsupervised Learning
find something interesting in unlabeled data
data comes only with inputs x, not output labels y; the algorithm has to find structure in the data on its own
clustering algorithms
automatically group unlabeled data into different clusters
e.g. Google News, grouping customers
==> group similar data points together
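A minimal sketch of clustering, assuming k-means as the algorithm (the notes do not name a specific one). The points and the starting centroids are made up; no labels are given to the algorithm.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Plain k-means; `iterations` must be at least 1."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                # Empty cluster: keep the old centroid.
                new_centroids.append(centroids[c])
        centroids = new_centroids
    return assignments, centroids


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]

# Assumption: start from two of the data points as initial centroids.
assignments, centroids = kmeans(points, [points[0], points[3]])
print(assignments)
```

The cluster indices are arbitrary names; the algorithm only discovers which points belong together.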
Anomaly detection
find unusual data points (events)
e.g. anomalous transactions in financial fraud
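A minimal sketch of anomaly detection, assuming a simple z-score rule: learn the mean and standard deviation of normal behavior, then flag anything too far from it. The transaction amounts and the threshold of 3 are illustrative assumptions, not values from these notes.

```python
import statistics


def fit_normal_profile(amounts):
    """Summarize 'normal' behavior by its mean and standard deviation."""
    return statistics.mean(amounts), statistics.pstdev(amounts)


def is_anomaly(amount, mean, std, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    return abs(amount - mean) / std > threshold


# Hypothetical history of ordinary transaction amounts (no labels needed).
history = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 23.0, 26.0]
mean, std = fit_normal_profile(history)

# New transactions to screen.
new_amounts = [22.0, 500.0, 27.0]
flagged = [a for a in new_amounts if is_anomaly(a, mean, std)]
print(flagged)
```

The profile is fitted on ordinary data only; fitting it on data that already contains the outlier would inflate the standard deviation and could hide the anomaly.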
Dimensionality reduction
compress data using fewer numbers
compress a large dataset into a smaller one while losing as little information as possible
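A minimal sketch of dimensionality reduction, assuming principal component analysis (PCA) restricted to the 2-D to 1-D case, where the main direction of a 2x2 covariance matrix has a closed form. The points are made up and lie exactly on a line, so in this toy case nothing is lost; real data would lose some information.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal axis (one number each)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Angle of the direction with the largest variance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    codes = [
        (x - mean_x) * direction[0] + (y - mean_y) * direction[1]
        for x, y in points
    ]
    return codes, direction, (mean_x, mean_y)


def reconstruct(codes, direction, mean):
    """Map the 1-D codes back into 2-D space."""
    return [
        (mean[0] + t * direction[0], mean[1] + t * direction[1])
        for t in codes
    ]


# Hypothetical data: two features that carry the same information.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]

codes, direction, mean = pca_2d_to_1d(points)  # 1 number per point, not 2
restored = reconstruct(codes, direction, mean)
max_error = max(
    math.dist(original, approx) for original, approx in zip(points, restored)
)
print(max_error)
```

Each point is stored as one number instead of two, and the reconstruction error measures how much information the compression lost.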