7. Overfitting

Author: 玄语梨落 | Published 2020-08-17 13:57


The Problem of Overfitting

  • Underfitting (high bias): the algorithm doesn't fit the training set well.
  • Overfitting (high variance): if we have too many features, the learned hypothesis may fit the training set very well, so the cost function may be very close to zero (perhaps exactly zero), but it fails to generalize to new examples.

Addressing overfitting

Options:

  1. Reduce number of features.
    • Manually select which features to keep.
    • Model selection algorithm.
  2. Regularization
    • Keep all the features, but reduce magnitude/values of parameters \theta_j.
    • Works well when we have a lot of features, each of which contributes a bit to predicting y.

Regularization

Suppose we penalize \theta_3 and \theta_4 (some of the parameters) and make them really small, as in the worked example below.
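For instance, take the linear-regression objective and add large penalty coefficients on \theta_3 and \theta_4 (the factor 1000 below is just an illustrative choice, not a value from the notes):

\min_\theta\;\frac{1}{2m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})^2+1000\,\theta_3^2+1000\,\theta_4^2

Minimizing this objective forces \theta_3 and \theta_4 close to zero, which effectively removes the corresponding high-order terms from the hypothesis.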

Keeping the parameter values small gives us:

  • "Smipler" hypothesis
  • Less prone to overfitting

J(\theta)=\frac{1}{2m}[\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})^2+\lambda\sum_{j=1}^n\theta_j^2]
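As a rough NumPy sketch of this cost function (the function name regularized_cost and the variable names are mine, not from the notes):

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear-regression cost J(theta).

    X     : (m, n+1) design matrix whose first column is all ones
    y     : (m,) vector of targets
    theta : (n+1,) parameter vector; theta[0] is not penalized
    lam   : regularization parameter lambda
    """
    m = len(y)
    residuals = X @ theta - y                # h_theta(x^(i)) - y^(i) for all i
    penalty = lam * np.sum(theta[1:] ** 2)   # lambda * sum_j theta_j^2, j >= 1
    return (residuals @ residuals + penalty) / (2 * m)
```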

Regularized Linear Regression

\theta_0:=\theta_0-\alpha\frac{1}{m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})x_0^{(i)} \newline \theta_j:=\theta_j-\alpha\left[\frac{1}{m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}+\frac{\lambda}{m}\theta_j\right] \quad (j=1,\dots,n) \newline \theta_j:=\theta_j\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}
We don't penalize \theta_0.
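A minimal NumPy sketch of one such update step, assuming a design matrix X whose first column is all ones (the helper name gradient_descent_step is illustrative):

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent update on theta (theta[0] unpenalized)."""
    m = len(y)
    error = X @ theta - y                # (m,) vector of h_theta(x^(i)) - y^(i)
    grad = (X.T @ error) / m             # (1/m) * sum_i error_i * x_j^(i) for each j
    grad[1:] += (lam / m) * theta[1:]    # add (lambda/m) * theta_j only for j >= 1
    return theta - alpha * grad
```

Repeating this step until convergence is equivalent to the "shrink then update" form in the last line of the equations above.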

Normal equation

\theta=\left(X^TX+\lambda L\right)^{-1}X^Ty,\quad L=\begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \\ \end{bmatrix}
Here L is an (n+1)\times(n+1) matrix whose top-left entry is 0 (so \theta_0 is not penalized) and whose remaining diagonal entries are 1.
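A hedged NumPy sketch of this closed-form solution (names are illustrative, not from the notes):

```python
import numpy as np

def regularized_normal_equation(X, y, lam):
    """Closed-form theta = (X^T X + lambda * L)^(-1) X^T y."""
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0                                    # leave theta_0 unpenalized
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

Using np.linalg.solve instead of forming the explicit inverse is the usual, numerically safer choice.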

Non-invertibility issue

If m\le n, then X^TX would be non-invertible (singular).
If \lambda>0, however, the regularized matrix X^TX+\lambda L is invertible.
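A quick numerical check of this claim on a toy design matrix with m = 2 examples and n + 1 = 4 columns (the numbers are arbitrary):

```python
import numpy as np

# m = 2 examples but n + 1 = 4 columns, so X^T X (4x4) has rank at most 2.
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [1.0, 5.0, 6.0, 7.0]])
L = np.eye(4)
L[0, 0] = 0.0
lam = 1.0

print(np.linalg.matrix_rank(X.T @ X))            # 2  -> singular
print(np.linalg.matrix_rank(X.T @ X + lam * L))  # 4  -> invertible
```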

Regularized Logistic Regression

Cost function:
J(\theta)=-\frac{1}{m}\sum_{i=1}^m\left[y^{(i)}\log h_\theta(x^{(i)})+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]+\frac{\lambda}{2m}\sum_{j=1}^n\theta_j^2
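A minimal NumPy sketch of this regularized logistic cost, following the same conventions as the linear-regression sketches above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_logistic_cost(theta, X, y, lam):
    """Regularized logistic-regression cost; theta[0] is not penalized."""
    m = len(y)
    h = sigmoid(X @ theta)                                    # h_theta(x^(i)) for all i
    cross_entropy = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    penalty = (lam / (2 * m)) * np.sum(theta[1:] ** 2)
    return cross_entropy + penalty
```

The gradient takes the same form as in regularized linear regression, with h_\theta now being the sigmoid of \theta^Tx.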
