Differences between StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier in Mahout

StackOverflow https://stackoverflow.com/questions/20694184

My question may be quite involved, but I would like to know the main differences between the StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier algorithms in Mahout. Which one performs better on a small amount of training data, or is that data-dependent? Which one is better for sentiment analysis? I would also like to hear about any other aspects in which they differ.

Thank you in advance!


Solution

Complement naive Bayes is a naive Bayes variant that tends to work better than the vanilla version when the classes in the training set are imbalanced. In short, it estimates the feature probabilities for each class y from the complement of y, i.e. from the samples of all other classes, instead of from the training samples of class y itself.
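To make the difference in estimation concrete, here is a minimal sketch in plain Python. It is not Mahout's implementation: the function names, the toy documents and the add-one (Laplace) smoothing with `alpha=1.0` are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

def word_counts_by_class(docs, labels):
    """Sum word counts over all training documents of each class."""
    counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        counts[label].update(tokens)
    return counts

def standard_probs(counts, vocab, alpha=1.0):
    """Standard estimate: P(word | y) from the documents of class y itself."""
    probs = {}
    for c, cnt in counts.items():
        total = sum(cnt[w] for w in vocab) + alpha * len(vocab)
        probs[c] = {w: (cnt[w] + alpha) / total for w in vocab}
    return probs

def complement_probs(counts, vocab, alpha=1.0):
    """Complement estimate: P(word | not y) from every class except y."""
    probs = {}
    for c in counts:
        comp = Counter()
        for other, cnt in counts.items():
            if other != c:
                comp.update(cnt)
        total = sum(comp[w] for w in vocab) + alpha * len(vocab)
        probs[c] = {w: (comp[w] + alpha) / total for w in vocab}
    return probs

# Hypothetical, deliberately imbalanced toy data: three "pos" documents, one "neg".
docs = [["good", "great"], ["good", "fine"], ["great", "good"], ["bad"]]
labels = ["pos", "pos", "pos", "neg"]
vocab = sorted({w for d in docs for w in d})

counts = word_counts_by_class(docs, labels)
std = standard_probs(counts, vocab)
comp = complement_probs(counts, vocab)
```

In this toy set the standard estimate for "neg" rests on a single document, while the complement estimate for "neg" is computed from the three "pos" documents. That is the sense in which the complement variant has more data behind each estimate when one class is rare.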

Other tips

The Complement Naive Bayes (CNB) classifier addresses a weakness of the Naive Bayes classifier by estimating the parameters for a class from the data in all sentiment classes except the one being evaluated.

1) Even though Naïve Bayes often performs well, it relies on several poor assumptions, such as independence between features and evenly distributed training data across classes (i.e. no skew).

2) Complement Naïve Bayes is a Naïve Bayes variant that tackles these poor assumptions of the parent classifier: uneven training-set size (the most frequent class in the training data dominates during actual classification) and independence (all features or attributes are treated individually).

Skewed data means having more training examples for one class than for another, which biases the decision-boundary weights. This in turn leads the classifier to unwittingly prefer one class over the other. To counter this problem, Complement Naïve Bayes estimates the probability parameters for a class c using data from all classes except c.
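The effect of skew on the decision can be sketched with a small hand-rolled comparison. This is an illustrative assumption, not Mahout's code: the helper names and the toy reviews are hypothetical, smoothing is add-one, and the complement rule follows the commonly described form that scores a class by the negated log-likelihood of the document under the other classes and uses no class prior.

```python
import math
from collections import Counter

def train_counts(docs, labels):
    """Word counts per class."""
    counts = {}
    for tokens, label in zip(docs, labels):
        counts.setdefault(label, Counter()).update(tokens)
    return counts

def smoothed(counter, vocab, alpha=1.0):
    """Add-alpha smoothed word probabilities over the vocabulary."""
    total = sum(counter[w] for w in vocab) + alpha * len(vocab)
    return {w: (counter[w] + alpha) / total for w in vocab}

def predict_standard(doc, counts, priors, vocab):
    """Multinomial naive Bayes: log prior plus log-likelihood under class c."""
    def score(c):
        theta = smoothed(counts[c], vocab)
        return math.log(priors[c]) + sum(math.log(theta[w]) for w in doc)
    return max(counts, key=score)

def predict_complement(doc, counts, vocab):
    """Complement rule: words common outside class c count against c."""
    def score(c):
        comp = Counter()
        for other, cnt in counts.items():
            if other != c:
                comp.update(cnt)
        theta = smoothed(comp, vocab)
        return -sum(math.log(theta[w]) for w in doc)
    return max(counts, key=score)

# Hypothetical skewed training set: six "pos" reviews, one "neg" review.
docs = [["good", "movie"], ["great", "movie"], ["good", "plot"],
        ["great", "acting"], ["good", "acting"], ["nice", "movie"],
        ["bad", "movie"]]
labels = ["pos"] * 6 + ["neg"]
vocab = sorted({w for d in docs for w in d})
counts = train_counts(docs, labels)
priors = {c: labels.count(c) / len(labels) for c in counts}

test_doc = ["bad", "movie"]  # all words must be in vocab for this sketch
std_pred = predict_standard(test_doc, counts, priors, vocab)
cnb_pred = predict_complement(test_doc, counts, vocab)
```

On this toy set the standard rule labels the test document "pos" and the complement rule labels it "neg". Much of that gap comes from the 6/7 class prior, which this complement sketch omits, so treat it as an illustration of how the majority class can dominate rather than as a benchmark of either classifier.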

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow