ML with xgboost: binary classification on the Higgs Boson dataset (Kaggle competition) using the xgboost algorithm with 5-fold cross validation (5f-CrVa), including model saving and visualization
數(shù)據(jù)集簡(jiǎn)介
Dataset之HiggsBoson:Higgs Boson(Kaggle競(jìng)賽)數(shù)據(jù)集的簡(jiǎn)介,、下載、案例應(yīng)用之詳細(xì)攻略
輸出結(jié)果
更新中……
1,、交叉訓(xùn)練時(shí)間比較長(zhǎng),大約需要20多分鐘,。
設(shè)計(jì)思路
更新中……
核心代碼
更新中……
import xgboost as xgb

num_round = 1000   # maximum number of boosting rounds tried during cross validation

print('running cross validation, with preprocessing function')
# do cross validation, for each fold
# the dtrain, dtest, param will be passed into fpreproc
# then the return value of fpreproc will be used to generate results of that fold
cvresult = xgb.cv(param, dtrain, num_round, nfold=5,
                  metrics={'ams@0.15', 'auc'},
                  early_stopping_rounds=10, seed=0,
                  fpreproc=fpreproc)
print('finish cross validation', '\n', cvresult)

# with early stopping, the number of rows in cvresult is the best number of rounds found
n_estimators = cvresult.shape[0]

print('train model using the best parameters by cv ... ')
bst = xgb.train(param, dtrain, n_estimators)
bst.save_model('data_input/xgboost/data_higgsboson/higgs_cv.model')
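The snippet above assumes that `param`, `dtrain`, and the `fpreproc` callback are already defined, but the article does not show them. The following is a minimal sketch of what those pieces might look like, modeled on the official xgboost Higgs/cross-validation demos; the file path `training.csv`, the column layout, and the parameter values are assumptions rather than code from the original post.

import numpy as np
import pandas as pd
import xgboost as xgb

# Load the Kaggle Higgs Boson training file (assumed path and column layout:
# 'EventId', 30 feature columns, a per-event 'Weight' column, and a 'Label' column with 's'/'b').
df = pd.read_csv('data_input/xgboost/data_higgsboson/training.csv')
label = (df['Label'] == 's').astype(int).values
weight = df['Weight'].values
data = df.drop(columns=['EventId', 'Weight', 'Label']).values

# Missing values in this dataset are encoded as -999.0
dtrain = xgb.DMatrix(data, label=label, weight=weight, missing=-999.0)

# Assumed booster parameters; 'binary:logitraw' outputs raw scores, which the ams@0.15 metric uses
param = {'objective': 'binary:logitraw', 'eta': 0.1, 'max_depth': 6,
         'eval_metric': 'auc', 'nthread': 4}

def fpreproc(dtrain, dtest, param):
    """Per-fold preprocessing: rebalance the positive class and rescale event weights
    so that each fold behaves like the full training set."""
    label = dtrain.get_label()
    ratio = float(np.sum(label == 0)) / np.sum(label == 1)
    param = dict(param, scale_pos_weight=ratio)
    wtrain = dtrain.get_weight()
    wtest = dtest.get_weight()
    sum_weight = sum(wtrain) + sum(wtest)
    wtrain *= sum_weight / sum(wtrain)
    wtest *= sum_weight / sum(wtest)
    dtrain.set_weight(wtrain)
    dtest.set_weight(wtest)
    return (dtrain, dtest, param)

After training, the saved model can be reloaded with `xgb.Booster(model_file='data_input/xgboost/data_higgsboson/higgs_cv.model')` and used to score a test DMatrix built the same way.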