
Cross Validation Explained





from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
import numpy as np

knn_clf = KNeighborsClassifier()
# cv is the number of folds the training data is split into (k-fold)
scores = cross_val_score(knn_clf, X_train, y_train, cv=10)
mean_score = np.mean(scores)

The purpose of cross validation is to find the best hyperparameters, so only the training data is passed in; the test set stays untouched until the final evaluation.
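A minimal runnable sketch of the step above. The original note does not say which dataset it uses, so the iris dataset and the train/test split here are assumptions for illustration:

```python
# Assumption: iris is used as a stand-in dataset for the note's X_train/y_train
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
import numpy as np

X, y = load_iris(return_X_y=True)
# Hold out a test set first; cross validation only ever sees the training split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

knn_clf = KNeighborsClassifier()
# cv=10 splits X_train into 10 folds: train on 9, validate on the remaining 1,
# rotating the validation fold, so we get 10 accuracy scores
scores = cross_val_score(knn_clf, X_train, y_train, cv=10)
mean_score = np.mean(scores)  # a single, more stable performance estimate
```

Averaging over folds smooths out the luck of any single train/validation split, which is exactly why the averaged score is the one used to compare hyperparameter settings.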

from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# In fact, grid search already uses cross validation internally
param_grid = [
    {
        'weights': ['uniform'],
        'n_neighbors': [i for i in range(1, 11)]
    },
    {
        'weights': ['distance'],
        'n_neighbors': [i for i in range(1, 11)],
        'p': [i for i in range(1, 6)]
    }
]
# n_jobs is the number of CPU cores to use (-1 means all available);
# verbose prints progress information while the search runs
knn_clf = KNeighborsClassifier()
grid_search = GridSearchCV(knn_clf, param_grid, n_jobs=-1, verbose=2)
grid_search.fit(X_train, y_train)
# best cross-validation score on the training set
grid_search.best_score_
# the hyperparameters that achieved it
grid_search.best_params_
# the best classifier, refit on the full training set
grid_search.best_estimator_
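The note stops at retrieving the best estimator; the natural last step is to score it once on the held-out test set. A runnable end-to-end sketch (the iris dataset and the 5-fold `cv` setting are assumptions, not from the original):

```python
# Assumption: iris stands in for the note's dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

param_grid = [
    {'weights': ['uniform'], 'n_neighbors': list(range(1, 11))},
    {'weights': ['distance'], 'n_neighbors': list(range(1, 11)),
     'p': list(range(1, 6))},
]
# cv=5 is an explicit choice here; GridSearchCV cross-validates every
# parameter combination and keeps the one with the best mean fold score
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, n_jobs=-1)
grid_search.fit(X_train, y_train)

best_knn = grid_search.best_estimator_     # already refit on all of X_train
test_acc = best_knn.score(X_test, y_test)  # final estimate on unseen data
```

Scoring on the test set only once, after the search is finished, keeps that estimate unbiased: the test data never influenced the hyperparameter choice.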

Leave-one-out cross validation (LOO-CV) gives an accurate estimate, but the computational cost is huge: every single sample serves as the validation set once, so a model is trained for each sample.
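The cost argument can be seen directly in code: with `LeaveOneOut` as the `cv` argument, `cross_val_score` performs one fit per sample. A small sketch (the iris dataset is again an assumed stand-in):

```python
# LOO-CV: each of the n samples is held out once, so n models are trained
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
loo = LeaveOneOut()
# one score per sample: 0 or 1 depending on whether that sample was classified
# correctly by a model trained on all the other samples
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=loo)
n_fits = len(scores)  # equals the number of samples in X
```

For iris this is only 150 fits, but for a dataset of a million rows it would be a million fits, which is why k-fold with a small k (5 or 10) is the usual compromise.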