Neural network compression
dc.contributor | Ayguadé Parra, Eduard |
dc.contributor | Llosa Espuny, José Francisco |
dc.contributor.author | Noguera Vall, Ferran |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors |
dc.date.accessioned | 2021-04-30T09:09:23Z |
dc.date.available | 2021-04-30T09:09:23Z |
dc.date.issued | 2021-01 |
dc.identifier.uri | http://hdl.handle.net/2117/344886 |
dc.description.abstract | In recent years, neural networks have grown in popularity, largely thanks to advances in the field of high-performance computing. Nevertheless, some factors still limit the usage of neural networks; two limiting factors in particular are storage requirements and computational cost. The aim of this project is to radically reduce storage demand and to provide directions for accelerating the execution of neural networks. Within the scope of this thesis, two compression algorithms have been developed. These algorithms share a common basis: both exploit error tolerance, a property that allows the weight matrix to be divided into blocks, simplifying the problem while barely impacting accuracy. The first algorithm groups the weights inside every block using different clustering techniques: the arithmetic mean and K-Means. To decide which clustering method to apply to each block, the standard deviation is employed, among other criteria. The user can specify a trade-off between accuracy and compression. This method has underperformed, obtaining a compression rate of 10.57 for AlexNet, which is far from the state of the art. The main issue is that meaningless weights are merged with significant ones, causing a significant drop in accuracy. The second algorithm tackles the problem of accuracy loss by pruning all the unimportant weights. After pruning, quantization is applied. For both steps, pruning and quantization, two options have been explored, each effective for different kinds of neural networks. Of the possible combinations of pruning and quantization, one is selected by trial and error. The first pruning technique focuses on removing as many weights as possible, while the second pruning method takes blocks into account to a greater extent. The two types of quantization allow three values per block and five values per block, respectively. This algorithm performed very well, obtaining a compression rate of 57.15 for AlexNet with minimal accuracy loss. |
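The abstract's first algorithm (block-wise clustering, choosing between the arithmetic mean and K-Means per block based on the standard deviation) can be sketched as follows. This is a minimal illustration assuming a flat weight array, a fixed block size, and a simple 1-D Lloyd's K-Means; the function name, thresholds, and parameters are hypothetical, not taken from the thesis.

```python
import numpy as np

def compress_blocks(weights, block_size=16, std_threshold=0.05, k=4):
    """Cluster the weights of each block: low-variance blocks are
    replaced by their arithmetic mean, higher-variance blocks are
    quantized to k centroids with a simple 1-D K-Means."""
    flat = weights.flatten()
    out = np.empty_like(flat)
    for start in range(0, flat.size, block_size):
        block = flat[start:start + block_size]
        if block.std() < std_threshold:
            # Low spread: a single shared value (arithmetic mean) suffices.
            out[start:start + block_size] = block.mean()
        else:
            # Higher spread: K-Means assigns each weight to one of k centroids.
            centroids = np.linspace(block.min(), block.max(), k)
            for _ in range(10):  # a few Lloyd iterations
                labels = np.abs(block[:, None] - centroids[None, :]).argmin(axis=1)
                for j in range(k):
                    if np.any(labels == j):
                        centroids[j] = block[labels == j].mean()
            out[start:start + block_size] = centroids[labels]
    return out.reshape(weights.shape)
```

After this step, each block holds at most k distinct values (or a single one), which is what makes a compact encoding of the weight matrix possible; the trade-off between accuracy and compression mentioned in the abstract corresponds to choices such as `block_size`, `std_threshold`, and `k`.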
dc.language.iso | eng |
dc.publisher | Universitat Politècnica de Catalunya |
dc.subject.lcsh | Neural networks (Computer science) |
dc.subject.lcsh | Machine learning |
dc.subject.lcsh | Artificial intelligence |
dc.subject.other | aprenentatge profund |
dc.subject.other | compressió de la matriu de pesos |
dc.subject.other | compressió de xarxes neuronals |
dc.subject.other | algoritmes de clustering |
dc.subject.other | quantització |
dc.subject.other | K-Means |
dc.subject.other | mitjana aritmètica |
dc.subject.other | acceleració de xarxes neuronals |
dc.subject.other | consum d'energia |
dc.subject.other | xarxes neuronals convolucionals |
dc.subject.other | capa densament connectada |
dc.subject.other | deep learning |
dc.subject.other | neural networks |
dc.subject.other | weight matrix compression |
dc.subject.other | neural network compression |
dc.subject.other | clustering algorithms |
dc.subject.other | quantization |
dc.subject.other | arithmetic mean |
dc.subject.other | matrix compression |
dc.subject.other | AlexNet |
dc.subject.other | LeNet |
dc.subject.other | CIFAR-10 |
dc.subject.other | MNIST |
dc.subject.other | ImageNet |
dc.subject.other | artificial intelligence |
dc.subject.other | convolutional neural networks |
dc.subject.other | fully-connected layer |
dc.title | Neural network compression |
dc.type | Master thesis |
dc.subject.lemac | Xarxes neuronals (Informàtica) |
dc.subject.lemac | Aprenentatge automàtic |
dc.subject.lemac | Intel·ligència artificial |
dc.identifier.slug | 156456 |
dc.rights.access | Open Access |
dc.date.updated | 2021-02-05T07:29:28Z |
dc.audience.educationlevel | Màster |
dc.audience.mediator | Facultat d'Informàtica de Barcelona |
dc.audience.degree | MÀSTER UNIVERSITARI EN INNOVACIÓ I RECERCA EN INFORMÀTICA (Pla 2012) |