Show simple item record

dc.contributor: Béjar Alonso, Javier
dc.contributor.author: Gibert Llauradó, Daniel
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
dc.description.abstract: According to antivirus (AV) vendors, the volume of malicious software has grown exponentially in recent years. One of the main reasons for these high volumes is that malware authors, in order to evade detection, started using polymorphic and metamorphic techniques. As a result, traditional signature-based detection approaches are insufficient against new malware, and categorizing malware samples has become essential both to understand the basis of malware behavior and to fight back against cybercriminals. Over the last decade, solutions against malicious software have begun to adopt machine learning approaches. Unfortunately, few open-source datasets are available to the academic community. One of the largest was released last year for a competition hosted on Kaggle, with data provided by Microsoft for the Big Data Innovators Gathering (BIG 2015). This thesis presents two novel and scalable approaches that use Convolutional Neural Networks (CNNs) to assign malware to its corresponding family. The first approach uses CNNs to learn a feature hierarchy that discriminates among malware samples represented as gray-scale images. The second approach uses the CNN architecture introduced by Yoon Kim [12] to classify malware samples according to their x86 instructions. The proposed methods achieved improvements of 93.86% and 98.56%, respectively, with respect to the equal probability benchmark.
dc.publisher: Universitat Politècnica de Catalunya
dc.subject: Àrees temàtiques de la UPC::Informàtica::Seguretat informàtica
dc.subject: Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Telemàtica i xarxes d'ordinadors
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.lcsh: Computer viruses
dc.subject.other: malware classification challenge
dc.subject.other: machine learning
dc.subject.other: artificial intelligence
dc.subject.other: deep learning
dc.subject.other: word embeddings
dc.subject.other: Skip-gram model
dc.title: Convolutional neural networks for malware classification
dc.type: Master thesis
dc.subject.lemac: Xarxes neuronals (Informàtica)
dc.subject.lemac: Virus informàtics
dc.rights.access: Restricted access - author's decision
dc.audience.mediator: Facultat d'Informàtica de Barcelona

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder