ML with xgboost: binary classification on the Higgs Boson dataset (Kaggle competition) using the xgboost algorithm with 5-fold cross validation (5f-CrVa), including model saving and visualization
數(shù)據(jù)集簡(jiǎn)介
Dataset之HiggsBoson:Higgs Boson(Kaggle競(jìng)賽)數(shù)據(jù)集的簡(jiǎn)介,、下載、案例應(yīng)用之詳細(xì)攻略
輸出結(jié)果
更新中……
1,、交叉訓(xùn)練時(shí)間比較長(zhǎng),大約需要20多分鐘,。
設(shè)計(jì)思路
更新中……
核心代碼
更新中……
import xgboost as xgb

num_round = 1000   # maximum number of boosting rounds tried during cross validation

print('running cross validation, with preprocessing function')
# do cross validation, for each fold
# the dtrain, dtest, param will be passed into fpreproc
# then the return value of fpreproc will be used to generate results of that fold
cvresult = xgb.cv(param, dtrain, num_round, nfold=5,
                  metrics={'ams@0.15', 'auc'},
                  early_stopping_rounds=10, seed=0,
                  fpreproc=fpreproc)
print('finish cross validation', '\n', cvresult)

# with early stopping, the number of rows in cvresult is the best number of rounds found
n_estimators = cvresult.shape[0]

print('train model using the best parameters by cv ... ')
bst = xgb.train(param, dtrain, n_estimators)
bst.save_model('data_input/xgboost/data_higgsboson/higgs_cv.model')
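The snippet above assumes that `param`, `dtrain`, and the `fpreproc` callback are already defined, but the article does not show them. The following is a minimal sketch of what those pieces might look like, modeled on the official xgboost Higgs/cross-validation demos; the file path `training.csv`, the column layout, and the parameter values are assumptions rather than code from the original post.

import numpy as np
import pandas as pd
import xgboost as xgb

# Load the Kaggle Higgs Boson training file (assumed path and column layout:
# 'EventId', 30 feature columns, a per-event 'Weight' column, and a 'Label' column with 's'/'b').
df = pd.read_csv('data_input/xgboost/data_higgsboson/training.csv')
label = (df['Label'] == 's').astype(int).values
weight = df['Weight'].values
data = df.drop(columns=['EventId', 'Weight', 'Label']).values

# Missing values in this dataset are encoded as -999.0
dtrain = xgb.DMatrix(data, label=label, weight=weight, missing=-999.0)

# Assumed booster parameters; 'binary:logitraw' outputs raw scores, which the ams@0.15 metric uses
param = {'objective': 'binary:logitraw', 'eta': 0.1, 'max_depth': 6,
         'eval_metric': 'auc', 'nthread': 4}

def fpreproc(dtrain, dtest, param):
    """Per-fold preprocessing: rebalance the positive class and rescale event weights
    so that each fold behaves like the full training set."""
    label = dtrain.get_label()
    ratio = float(np.sum(label == 0)) / np.sum(label == 1)
    param = dict(param, scale_pos_weight=ratio)
    wtrain = dtrain.get_weight()
    wtest = dtest.get_weight()
    sum_weight = sum(wtrain) + sum(wtest)
    wtrain *= sum_weight / sum(wtrain)
    wtest *= sum_weight / sum(wtest)
    dtrain.set_weight(wtrain)
    dtest.set_weight(wtest)
    return (dtrain, dtest, param)

After training, the saved model can be reloaded with `xgb.Booster(model_file='data_input/xgboost/data_higgsboson/higgs_cv.model')` and used to score a test DMatrix built the same way.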