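I've recently been reading about models while studying the math behind them, and it seems better to work through an implementation myself.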
- Load the dataset: the breast cancer dataset bundled with scikit-learn can be used directly
from sklearn.datasets import load_breast_cancer
breast_cancer = load_breast_cancer()
samples = breast_cancer.data  # samples (also called instances)
label = breast_cancer.target  # labels
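Before splitting, it can help to look at what was actually loaded. The following is only a quick inspection sketch that reuses the variable names from the snippet above; `feature_names` and `target_names` are attributes of the object returned by `load_breast_cancer`.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

breast_cancer = load_breast_cancer()
samples = breast_cancer.data
label = breast_cancer.target

# One row per sample, one column per feature
print(samples.shape)

# The first few feature names, to get a feel for the columns
print(breast_cancer.feature_names[:3])

# Class names; the position in this array is the integer used in `label`
print(breast_cancer.target_names)

# Number of samples in each class (index 0, then index 1)
print(np.bincount(label))
```

The two classes are not perfectly balanced, which is worth keeping in mind when reading the accuracy figure later.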
- Split the dataset
from sklearn.model_selection import train_test_split
# Randomly split the arrays into a training set and a test set; test_size is the test-set fraction (train:test = 7:3)
sample_train, sample_test, label_train, label_test = train_test_split(samples, label, test_size=0.3)
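Because the split is random, every run produces different subsets. A minimal sketch of a reproducible variant follows; the `random_state=0` value and the use of `stratify` are my own choices for illustration, not part of the original code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

samples, label = load_breast_cancer(return_X_y=True)

# random_state fixes the shuffle so the same split comes back on every run;
# stratify=label keeps the class proportions similar in both subsets.
sample_train, sample_test, label_train, label_test = train_test_split(
    samples, label, test_size=0.3, random_state=0, stratify=label
)

# Sizes of the two subsets
print(len(label_train), len(label_test))
```

Dropping `random_state` restores the original behavior, where each run draws a fresh split.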
- Import the logistic regression model and train it
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(sample_train, label_train)  # train the classifier (fit the model according to the given training data)
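Since the point of this exercise is to connect the model with the math, here is a rough sketch of what the fitted binary classifier computes: a linear score z = w·x + b passed through the sigmoid, P(y = 1 | x) = 1 / (1 + e^(-z)). The names `z`, `manual_proba` and `manual_pred` are mine, and `max_iter=10000` and `random_state=0` are assumptions (the former only gives the solver more room to converge on unscaled features).

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

samples, label = load_breast_cancer(return_X_y=True)
sample_train, sample_test, label_train, label_test = train_test_split(
    samples, label, test_size=0.3, random_state=0
)

# Assumption: max_iter raised above the default so the solver can converge
classifier = LogisticRegression(max_iter=10000)
classifier.fit(sample_train, label_train)

# Linear score z = w . x + b, computed by hand from the learned parameters
z = sample_test @ classifier.coef_.ravel() + classifier.intercept_[0]

# Sigmoid of the score gives the probability of class 1
manual_proba = expit(z)

# A positive score means class 1, otherwise class 0
manual_pred = (z > 0).astype(int)

# Compare against what scikit-learn reports
sklearn_proba = classifier.predict_proba(sample_test)[:, 1]
print(np.allclose(manual_proba, sklearn_proba))
print(np.array_equal(manual_pred, classifier.predict(sample_test)))
```

This covers only the prediction side; fitting the weights (minimizing the regularized log loss) is what `fit` does internally.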
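- Check the accuracy
predictions = classifier.predict(sample_test)  # predict the test samples once, outside the loop
count = 0  # number of misclassified test samples
Length = len(label_test)
for i in range(Length):
    if predictions[i] != label_test[i]:
        count += 1
print(count)  # number of misclassified test samples
print(1 - count / Length)  # accuracy on the test set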
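The hand-written count can be cross-checked against scikit-learn's own helpers. A small sketch, assuming a fixed `random_state=0` and a larger `max_iter` purely so that the example is repeatable; `errors` and `manual_accuracy` are names I introduce here.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

samples, label = load_breast_cancer(return_X_y=True)
sample_train, sample_test, label_train, label_test = train_test_split(
    samples, label, test_size=0.3, random_state=0
)

# Assumption: max_iter raised above the default so the solver can converge
classifier = LogisticRegression(max_iter=10000)
classifier.fit(sample_train, label_train)

predictions = classifier.predict(sample_test)

# Vectorized version of the counting loop
errors = int(np.sum(predictions != label_test))
manual_accuracy = 1 - errors / len(label_test)

print(errors)
print(manual_accuracy)

# The same accuracy, computed by scikit-learn in two ways
print(accuracy_score(label_test, predictions))
print(classifier.score(sample_test, label_test))
```

All three accuracy values should agree up to floating-point rounding, since each is the fraction of correctly predicted test samples.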
- The complete code:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
# load the dataset: breast_cancer
breast_cancer = load_breast_cancer()
samples = breast_cancer.data
label = breast_cancer.target
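# Randomly split the arrays into a training set and a test set; test_size is the test-set fraction (train:test = 7:3)
sample_train, sample_test, label_train, label_test = train_test_split(samples, label, test_size=0.3)
classifier = LogisticRegression()  # instantiate the class with all default parameters
classifier.fit(sample_train, label_train)  # train the classifier (fit the model according to the given training data)
predictions = classifier.predict(sample_test)  # predict the test samples once, outside the loop
count = 0  # number of misclassified test samples
Length = len(label_test)
for i in range(Length):
    if predictions[i] != label_test[i]:
        count += 1
print(count)  # number of misclassified test samples
print(1 - count / Length)  # accuracy on the test set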
- Output (the training and test sets are split randomly, so the result may differ from run to run):
8
0.9532163742690059
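Since a single random split gives a slightly different number each time, one common way to get a steadier estimate is k-fold cross-validation. This is only a sketch of that idea, not something the original code does; `cv=5` and `max_iter=10000` are my assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

samples, label = load_breast_cancer(return_X_y=True)

# Assumption: max_iter raised above the default so the solver can converge
classifier = LogisticRegression(max_iter=10000)

# Train and evaluate on 5 different train/test partitions of the data
scores = cross_val_score(classifier, samples, label, cv=5)

print(scores)                       # one accuracy value per fold
print(scores.mean(), scores.std())  # average accuracy and its spread
```

The mean over folds is less sensitive to one lucky or unlucky split than a single 7:3 partition.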