Can I use SVC() as a base_estimator for ensemble methods?
Question
I am currently testing out a few different ensemble methods on my dataset. I've heard that support vector machines can also be used as base learners in boosting and bagging methods, but I am not sure which methods allow it. In particular, for XGBClassifier I tried out trees and SVMs as base learners and got exactly the same results on 5 different performance metrics, which made me question the results and suspect that the option only accepts trees as base learners. I didn't find much information in the documentation. I would be interested in AdaBoostClassifier(), BaggingClassifier() and XGBClassifier(). Does anybody know whether or not I can use SVMs here as base learners?
Solution
In short: Yes.
Conceptually, bagging and boosting are model-agnostic techniques, meaning that they work regardless of the learner.
Bagging essentially is the following:
- create multiple predictors (they can even be hard-coded!)
- combine the learners' predictions into a final prediction (e.g. by majority vote; see the sketch below)
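To make this concrete, here is a minimal, hand-rolled bagging sketch (my own illustration, not scikit-learn's implementation), assuming numpy arrays and non-negative integer class labels:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def bagging_predict(X_train, y_train, X_test, n_estimators=10, seed=0):
    """Train SVMs on bootstrap samples and majority-vote their predictions."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_estimators):
        # bootstrap sample: draw len(X_train) indices with replacement
        idx = rng.integers(0, len(X_train), size=len(X_train))
        preds.append(SVC().fit(X_train[idx], y_train[idx]).predict(X_test))
    # majority vote per test point (assumes non-negative integer labels)
    return np.apply_along_axis(
        lambda c: np.bincount(c).argmax(), 0, np.stack(preds))

X, y = make_classification(n_samples=200, random_state=0)
print(bagging_predict(X[:150], y[:150], X[150:]))
```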
Boosting can be seen as:
- train a predictor
- find where the predictor makes mistakes
- put more emphasis on these mistakes
- repeat until satisfactory (see the sketch below)
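The exact update rule depends on the algorithm (AdaBoost, for example, uses an error-dependent update), but the reweighting idea can be sketched like this, assuming a base learner whose fit() accepts sample_weight; the doubling factor here is just for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def boosting_weights(X, y, n_rounds=5):
    """Illustration only: double the weight of misclassified samples each round."""
    w = np.full(len(X), 1.0 / len(X))  # start with uniform sample weights
    for _ in range(n_rounds):
        clf = SVC().fit(X, y, sample_weight=w)
        mistakes = clf.predict(X) != y
        w[mistakes] *= 2.0   # put more emphasis on the mistakes
        w /= w.sum()         # renormalise
    return w

X, y = make_classification(n_samples=200, random_state=0)
print(boosting_weights(X, y).round(4))
```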
Regarding the specific Sklearn implementations, here are the base learners that you can use:
- AdaBoostClassifier()
The documentation says: "Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes."
This means that you can use any model whose fit method accepts sample weights as part of the learning process (SVC, decision trees, etc.).
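For example, SVC supports sample_weight in fit(), so it qualifies. One caveat: the default algorithm="SAMME.R" needs predict_proba, which SVC only provides with probability=True, so it is easiest to use algorithm="SAMME". A sketch, using parameter names from scikit-learn versions around the time of writing (base_estimator was renamed to estimator in scikit-learn 1.2):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

ada = AdaBoostClassifier(
    base_estimator=SVC(),  # renamed to `estimator` in scikit-learn >= 1.2
    algorithm="SAMME",     # "SAMME.R" would need SVC(probability=True)
    n_estimators=10,
    random_state=0,
)
ada.fit(X, y)
print(ada.score(X, y))
```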
- BaggingClassifier()
This is a plain bagging strategy, so any estimator can be used here, including SVC().
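A sketch with SVC (same parameter-naming caveat as above):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

bag = BaggingClassifier(
    base_estimator=SVC(),  # renamed to `estimator` in scikit-learn >= 1.2
    n_estimators=10,
    random_state=0,
)
bag.fit(X, y)
print(bag.score(X, y))
```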
- GradientBoostingClassifier()
Gradient boosting works differently: each new learner is fit to the negative gradient of the loss (the pseudo-residuals), and scikit-learn's implementation is hard-coded to build regression trees at every stage, so you cannot swap in an SVM here. The same goes for XGBClassifier(): XGBoost only offers tree boosters (gbtree, dart) and a linear booster (gblinear), so there is no way to use an SVM as a base learner; an SVM setting would most likely have been silently ignored, which would explain why you got identical results across your 5 metrics.