This week was fairly busy, and I went home for the holiday, so the naive Bayes module that was originally part of this week's plan didn't get finished; I'll catch up on it next week. Fortunately I've encountered this material before. The current course leans more toward the underlying theory, so it is challenging, but still manageable for now.
A quick review of what I studied this week:
1. Learned how a computer "thinks" when making recommendations
(course illustrations: recommending apps; decision tree)
2. Entropy and how to compute it
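The two formula images did not survive here, so reconstructing from memory: what the lesson presents should be the standard Shannon entropy. For a node whose samples fall into classes with proportions $p_1, \dots, p_k$:

$$H = -\sum_{i=1}^{k} p_i \log_2 p_i$$

For example, a node whose samples are split 50/50 between two classes has entropy $-0.5\log_2 0.5 - 0.5\log_2 0.5 = 1$ bit, while a pure node has entropy 0.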
3. Information Gain
Of the three splitting methods shown below, which one gives us the most information about the data?
(course illustration: information gain)
Information gain = the change in entropy.
At each node of the decision tree, we can compute the entropy of the data at the parent node and the entropy of each of the two child nodes; the information gain is the difference between the parent's entropy and the weighted average of the children's entropies (weighted by the number of samples in each child).
The formula for information gain:
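The formula image is missing here; the standard definition, consistent with the description above, is: if the parent node's $m + n$ samples are split into children of sizes $m$ and $n$, then

$$\text{Information Gain} = H(\text{parent}) - \left[ \frac{m}{m+n}\, H(\text{child}_1) + \frac{n}{m+n}\, H(\text{child}_2) \right]$$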
When building a decision tree, we choose the split that yields the largest information gain.
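A small sketch of these two formulas in Python (the function names are my own, not from the course):

import numpy as np

def entropy(labels):
    # Shannon entropy (in bits) of a sequence of class labels.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    # Parent entropy minus the size-weighted average entropy of the children.
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children

# A split that separates the two classes perfectly removes all uncertainty,
# so the gain equals the parent's entropy (1 bit here).
parent = [0, 0, 1, 1]
print(information_gain(parent, [0, 0], [1, 1]))  # 1.0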
4. Hyperparameters
(1) Maximum depth
(2) Minimum number of samples per leaf
(3) Minimum number of samples per split
(4) Maximum number of features
5. Decision Trees in sklearn
>>> from sklearn.tree import DecisionTreeClassifier
>>> model = DecisionTreeClassifier()
>>> model.fit(x_values, y_values)
>>> print(model.predict([[0.2, 0.8], [0.5, 0.4]]))
[0. 1.]
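Since x_values and y_values are not defined in the snippet above, here is a minimal self-contained version; the data values are made up purely for illustration:

from sklearn.tree import DecisionTreeClassifier

# Made-up 2-D feature points and binary labels, for illustration only.
x_values = [[0.1, 0.9], [0.3, 0.7], [0.6, 0.3], [0.8, 0.2]]
y_values = [0, 0, 1, 1]

model = DecisionTreeClassifier()
model.fit(x_values, y_values)
print(model.predict([[0.2, 0.8], [0.5, 0.4]]))  # e.g. [0 1]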
Hyperparameters
When we define the model, we can specify the hyperparameters. In practice, the most common ones are:
max_depth: The maximum number of levels in the tree.
min_samples_leaf: The minimum number of samples allowed in a leaf.
min_samples_split: The minimum number of samples required to split an internal node.
max_features: The number of features to consider when looking for the best split.
For example, here we define a model where the maximum depth of the tree, max_depth, is 7, and the minimum number of samples in each leaf, min_samples_leaf, is 10.
>>> model = DecisionTreeClassifier(max_depth=7, min_samples_leaf=10)
>>> from sklearn.metrics import accuracy_score
>>> y_pred = model.predict(x_values)
>>> acc = accuracy_score(y_values, y_pred)  # fraction of correct predictions