Boosting trees for anti-spam email filtering (Extended version)

Carreras Pérez, Xavier; Màrquez Villodre, Lluís

Document type: Research report

Publication date: 2001-10

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication, or transformation is prohibited without the authorization of the rights holder.

Abstract

In this work, a set of comparative experiments on the problem of automatically filtering unwanted electronic mail messages is performed on two public corpora: PU1 and LingSpam. Several variants of the AdaBoost algorithm with confidence-rated predictions (Schapire et al., 99) have been applied, which differ in the complexity of the base learners considered. Two main conclusions can be drawn from our experiments: a) the boosting-based methods clearly outperform the results of the other learning algorithms published on the two evaluation corpora, achieving very high levels of the F_1 measure; b) increasing the complexity of the base learners yields better high-precision classifiers, which is very important when misclassification costs are considered.
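
As a rough illustration of the technique named in the abstract, the following is a minimal sketch of AdaBoost with confidence-rated predictions, restricted to the simplest base learner (a one-level decision stump over binary word-presence features). It is not the authors' implementation: the function names, the toy messages, and the smoothing constant `eps = 1/n` are all assumptions made for this example, and the deeper tree base learners studied in the report are not covered. An F_1 helper is included because the abstract reports results with that measure.

```python
import math

def train_stump(X, y, D, eps):
    """Return (feature, confidences) for the binary feature minimizing Z."""
    best = None
    for j in range(len(X[0])):
        # W[b][l]: total weight of examples with feature value b and label l
        W = {0: {1: 0.0, -1: 0.0}, 1: {1: 0.0, -1: 0.0}}
        for xi, yi, di in zip(X, y, D):
            W[xi[j]][yi] += di
        # Normalization factor Z for this partition (smaller is better)
        Z = 2 * sum(math.sqrt(W[b][1] * W[b][-1]) for b in (0, 1))
        if best is None or Z < best[0]:
            # Smoothed real-valued prediction for each block of the partition
            c = {b: 0.5 * math.log((W[b][1] + eps) / (W[b][-1] + eps))
                 for b in (0, 1)}
            best = (Z, j, c)
    return best[1], best[2]

def adaboost(X, y, rounds=5):
    """Train `rounds` confidence-rated stumps; labels are +1 (spam) / -1."""
    n = len(X)
    D = [1.0 / n] * n          # uniform initial distribution over examples
    eps = 1.0 / n              # assumed smoothing constant (hypothetical choice)
    model = []
    for _ in range(rounds):
        j, c = train_stump(X, y, D, eps)
        model.append((j, c))
        # Re-weight: examples predicted wrongly (or weakly) gain weight
        D = [d * math.exp(-yi * c[xi[j]]) for xi, yi, d in zip(X, y, D)]
        total = sum(D)
        D = [d / total for d in D]
    return model

def score(model, x):
    """Sum of confidence-rated predictions; the sign is the predicted class."""
    return sum(c[x[j]] for j, c in model)

def predict(model, x):
    return 1 if score(model, x) > 0 else -1

def f1(y_true, y_pred):
    """F_1 of the positive (spam) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy data (hypothetical): presence of the words "free", "money", "meeting"
X = [(1, 1, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 1), (1, 0, 1)]
y = [1, 1, 1, -1, -1, -1]

model = adaboost(X, y)
predictions = [predict(model, x) for x in X]
```

Because `score` returns a real value rather than a hard label, its decision threshold could be raised above zero to trade recall for precision, which is one simple way to account for unequal misclassification costs.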

Citation: Carreras, X., Màrquez, L. "Boosting trees for anti-spam email filtering (Extended version)". 2001.

Part of: LSI-01-44-R

URI: http://hdl.handle.net/2117/97841

Files	Description	Size	Format	View
R01-44.ps		566.9 KB	PostScript	View/Open
