|
IMPROVING THE PERFORMANCE OF ANTI-SPAM FILTERS USING OUT-OF-VOCABULARY STATISTICS. DOI: 10.4067/S0718-33052009000300012. Keywords: spam, filtering, out-of-vocabulary. Abstract: This paper presents a feature based on out-of-vocabulary word statistics that complements the information sources used by state-of-the-art spam filters in their decisions. The experiments used freely available spam filters as references (SpamAssassin, Bogofilter, SpamBayes and SpamProbe) as well as a naive Bayes classifier. The results show that the decision based on the proposed feature improves the performance of all spam filters under study.
|
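The abstract does not define the out-of-vocabulary feature, so the following is only a minimal illustrative sketch of what such a statistic might look like. It assumes the feature is the fraction of whitespace-separated tokens in a message that are missing from a reference word list. The function name `oov_ratio`, the toy vocabulary and the sample messages are hypothetical and are not taken from the paper.

```python
import string


def oov_ratio(text, vocabulary):
    """Return the fraction of tokens in `text` that are not in `vocabulary`.

    Assumption (not from the paper): tokens are lowercased,
    whitespace-separated words with surrounding punctuation stripped.
    """
    tokens = [t.strip(string.punctuation) for t in text.lower().split()]
    tokens = [t for t in tokens if t]  # drop tokens that were only punctuation
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in vocabulary)
    return unknown / len(tokens)


# Hypothetical toy vocabulary; a real one would be a full dictionary word list.
vocabulary = {"the", "meeting", "is", "at", "noon", "buy", "cheap", "now"}

ham = "The meeting is at noon."
spam = "Buy ch3ap v1agra now"

print(oov_ratio(ham, vocabulary))   # every token is in the vocabulary
print(oov_ratio(spam, vocabulary))  # obfuscated words count as out-of-vocabulary
```

In this sketch, deliberately misspelled or obfuscated words raise the ratio, which is one plausible reason such a statistic could complement the token-based evidence that existing filters already use. How the paper actually combines the feature with each filter's decision is not stated in the abstract.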