Show simple item record

dc.contributorAlquézar Mancho, René
dc.contributor.authorNorte Sosa, José
dc.date.accessioned2011-03-10T15:24:54Z
dc.date.available2011-03-10T15:24:54Z
dc.date.issued2010-08
dc.identifier.urihttp://hdl.handle.net/2099.1/11321
dc.description.abstractMost e-mail readers spend a non-trivial amount of time regularly deleting junk e-mail (spam) messages, even as an expanding volume of such e-mail occupies server storage space and consumes network bandwidth. An ongoing challenge, therefore, rests within the development and refinement of automatic classifiers that can distinguish legitimate e-mail from spam. Some published studies have examined spam detectors using Naïve Bayesian approaches and large feature sets of binary attributes that determine the existence of common keywords in spam, and many commercial applications also use Naïve Bayesian techniques. Spammers recognize these attempts to prevent their messages and have developed tactics to circumvent these filters, but these evasive tactics are themselves patterns that human readers can often identify quickly. This work had the objectives of developing an alternative approach using a neural network (NN) classifier brained on a corpus of e-mail messages from several users. The features selection used in this work is one of the major improvements, because the feature set uses descriptive characteristics of words and messages similar to those that a human reader would use to identify spam, and the model to select the best feature set, was based on forward feature selection. Another objective in this work was to improve the spam detection near 95% of accuracy using Artificial Neural Networks; actually nobody has reached more than 89% of accuracy using ANN.
dc.language.isoeng
dc.publisherUniversitat Politècnica de Catalunya
dc.rightsAttribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subjectÀrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Sistemes experts
dc.subject.lcshUnsolicited electronic mail messages -- Classification
dc.titleSpam Classification Using Machine Learning Techniques - Sinespam
dc.typeMaster thesis
dc.subject.lemacCorreu brossa (Correu electrònic) -- Classificació
dc.rights.accessOpen Access
dc.audience.educationlevelMàster


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record

Attribution-NonCommercial-NoDerivs 3.0 Spain
Except where otherwise noted, content on this work is licensed under a Creative Commons license : Attribution-NonCommercial-NoDerivs 3.0 Spain