Kaggle | Exercise 2: Model Validation

Author: 十二支箭 | Published 2020-04-09 17:02

A standardized machine-learning workflow from the official Kaggle site.

Recap

You've built a model. In this exercise you will test how good your model is.

Run the cell below to set up your coding environment where the previous exercise left off.

# Code you have previously used to load data
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Path of the file to read
iowa_file_path = '../input/home-data-for-ml-course/train.csv'

home_data = pd.read_csv(iowa_file_path)
y = home_data.SalePrice
feature_columns = ['LotArea', 'YearBuilt', '1stFlrSF', '2ndFlrSF', 'FullBath', 'BedroomAbvGr', 'TotRmsAbvGrd']
X = home_data[feature_columns]

# Specify Model
iowa_model = DecisionTreeRegressor()
# Fit Model
iowa_model.fit(X, y)

print("First in-sample predictions:", iowa_model.predict(X.head()))
print("Actual target values for those homes:", y.head().tolist())

# Set up code checking
from learntools.core import binder
binder.bind(globals())
from learntools.machine_learning.ex4 import *
print("Setup Complete")
First in-sample predictions: [208500. 181500. 223500. 140000. 250000.]
Actual target values for those homes: [208500, 181500, 223500, 140000, 250000]
Setup Complete

Exercises

Step 1: Split Your Data

Use the train_test_split function to split up your data.

Give it the argument random_state=1 so the check functions know what to expect when verifying your code.

Recall, your features are loaded in the DataFrame X and your target is loaded in y.

# Import the train_test_split function
from sklearn.model_selection import train_test_split

# Split the data into training and validation sets
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)

# Check your answer
step_1.check()
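By default (when neither test_size nor train_size is given), train_test_split holds out 25% of the rows for validation. A minimal sketch on synthetic arrays (not the Iowa data) showing the split sizes:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)  # 50 rows, 2 features
y = np.arange(50)

# Default split: 75% training, 25% validation (test size is rounded up)
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)
print(len(train_X), len(val_X))  # 37 13
```

Passing random_state=1 makes the shuffle reproducible, which is why the checker can verify your split.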

Step 2: Specify and Fit the Model

Create a DecisionTreeRegressor model and fit it to the relevant data.
Set random_state to 1 again when creating the model.

# You imported DecisionTreeRegressor in your last exercise
# and that code has been copied to the setup code above. So, no need to
# import it again

# Specify the model
iowa_model = DecisionTreeRegressor(random_state=1)

# Fit iowa_model with the training data.
iowa_model.fit(train_X, train_y)

# Check your answer
step_2.check()

Step 3: Make Predictions with Validation Data

# Predict with all validation observations
val_predictions = iowa_model.predict(val_X)

# Check your answer
step_3.check()

Inspect your predictions and actual values from validation data.

# print the top few validation predictions
print(iowa_model.predict(val_X.head()))
# or: print(val_predictions[:5])
# print the top few actual prices from validation data
print(val_y.head().tolist())
[186500. 184000. 130000.  92000. 164500.]
[231500, 179500, 122000, 84500, 142000]

What do you notice that is different from the in-sample predictions (which are printed after the top code cell on this page)?

Do you remember why validation predictions differ from in-sample (or training) predictions? This is an important idea from the last lesson.
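The gap exists because an unconstrained decision tree can memorize its training rows, so in-sample error is near zero while held-out error stays substantial. A minimal sketch on synthetic data (not the Iowa dataset) illustrating this:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Synthetic noisy regression problem
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X.ravel() + rng.normal(0, 2, size=200)

train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)

# An unconstrained tree places each training row in its own leaf
model = DecisionTreeRegressor(random_state=1).fit(train_X, train_y)

train_mae = mean_absolute_error(train_y, model.predict(train_X))
val_mae = mean_absolute_error(val_y, model.predict(val_X))
print(train_mae, val_mae)  # train MAE is ~0; validation MAE stays well above it
```

The model "predicts" the training targets almost perfectly because it has seen them, but that accuracy does not transfer to unseen rows.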

Step 4: Calculate the Mean Absolute Error in Validation Data

from sklearn.metrics import mean_absolute_error
val_mae = mean_absolute_error(val_y, val_predictions)

# print the validation MAE
print(val_mae)

# Check your answer
step_4.check()
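As a sanity check, MAE can be worked out by hand on the five validation pairs printed above (note this is only the first five rows, not the full validation set, so it will not match val_mae exactly):

```python
import numpy as np

# The five validation predictions and actual prices printed earlier
preds = np.array([186500., 184000., 130000., 92000., 164500.])
actual = np.array([231500., 179500., 122000., 84500., 142000.])

# Mean Absolute Error: average of |prediction - actual|
# |diff| = 45000, 4500, 8000, 7500, 22500 -> sum 87500 -> / 5
mae = np.abs(preds - actual).mean()
print(mae)  # 17500.0
```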

Is that MAE good? There isn't a general rule for what values are good that applies across applications. But you'll see how to use (and improve) this number in the next step.

Keep Going

You are ready for [Underfitting and Overfitting].

To be continued


Original link: https://www.haomeiwen.com/subject/knwcmhtx.html