The Ultimate Weapon of Kaggle Competitions: Model Ensembling
"If you don't have any good ideas left, just try model ensembling!"
Model ensembling is a highly effective technique that can noticeably improve performance on ML tasks. By combining multiple single models, it can reduce bias and variance, control overfitting, and raise accuracy.
The article below explains why ensembling achieves these effects and introduces several common ensembling methods: (weighted) voting, averaging, stacking, and blending.
Model ensembling is a very powerful technique to increase accuracy on a variety of ML tasks. In this article I will share my ensembling approaches for Kaggle Competitions.
For the first part we look at creating ensembles from submission files. The second part will look at creating ensembles through stacked generalization/blending.
"This is how you win ML competitions: you take other people's work and ensemble them together." (Vitaly Kuznetsov, NIPS 2014)
1. Creating ensembles from submission files
The most basic and convenient way to ensemble is to ensemble Kaggle submission CSV files. You only need the predictions on the test set for these methods — no need to retrain a model. This makes it a quick way to ensemble already existing model predictions, ideal when teaming up.
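As a concrete illustration, here is a minimal sketch of averaging submission files with pandas. The file names, the "id"/"prediction" column names, and the weights are hypothetical assumptions for illustration, not the original article's exact code; adapt them to the competition's submission format.

```python
import pandas as pd

# Hypothetical submission files produced by three different models.
files = ["model_a.csv", "model_b.csv", "model_c.csv"]
subs = [pd.read_csv(f) for f in files]

blend = subs[0][["id"]].copy()

# Simple unweighted average of the predicted probabilities.
blend["prediction"] = sum(s["prediction"] for s in subs) / len(subs)
blend.to_csv("avg_submission.csv", index=False)

# Weighted average; the weights here are made up and would normally be
# chosen based on each model's validation or public-leaderboard score.
weights = [0.5, 0.3, 0.2]
blend["prediction"] = sum(w * s["prediction"] for w, s in zip(weights, subs))
blend.to_csv("weighted_submission.csv", index=False)
```

Because these methods only touch the prediction files, each teammate can keep their own training pipeline and contribute a single CSV.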
2. Stacked Generalization & Blending
Averaging prediction files is nice and easy, but it's not the only method that the top Kagglers are using. The serious gains start with stacking and blending. Hold on to your top-hats and petticoats: Here be dragons. With 7 heads. Standing on top of 30 other dragons.
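To make the idea concrete, below is a minimal sketch of stacked generalization using scikit-learn: out-of-fold predictions from the base models become the training features for a second-stage meta-model. The synthetic dataset, model choices, and parameters are illustrative assumptions, not the original article's exact setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

# Synthetic stand-in for a competition dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, y_train, X_test = X[:800], y[:800], X[800:]

base_models = [
    RandomForestClassifier(n_estimators=100, random_state=0),
    GradientBoostingClassifier(random_state=0),
]

kf = KFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold predictions become the meta-features: the second-stage model
# never sees predictions made on data the base model was fitted on.
train_meta = np.zeros((len(X_train), len(base_models)))
test_meta = np.zeros((len(X_test), len(base_models)))

for i, model in enumerate(base_models):
    for fit_idx, oof_idx in kf.split(X_train):
        model.fit(X_train[fit_idx], y_train[fit_idx])
        train_meta[oof_idx, i] = model.predict_proba(X_train[oof_idx])[:, 1]
    # Refit on the full training set to generate test-set meta-features.
    model.fit(X_train, y_train)
    test_meta[:, i] = model.predict_proba(X_test)[:, 1]

# The second-stage (meta) model learns how to combine the base models.
stacker = LogisticRegression()
stacker.fit(train_meta, y_train)
final_prediction = stacker.predict_proba(test_meta)[:, 1]
```

Blending works the same way, except that the meta-features come from a single held-out split instead of out-of-fold predictions: simpler to set up, but it leaves less data for the second stage.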
Original article: https://mlwave.com/kaggle-ensembling-guide/