A reinforcement learning algorithm for optimal virtual network function allocation among 5G edge computing centers
Cite as:
hdl:2117/349626
Author's email: carlos.ruiz.de.mendoza@estudiantat.upc.edu
Carried out at/with: Centre Tecnològic de Telecomunicacions de Catalunya (CTTC)
Document type: Official master's degree final project
Date: 2021-07-14
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
Fifth generation (5G) mobile networks are enabling operators and stakeholders to enhance existing services and introduce new ones in response to increasing market demand. The 5G architecture provides the scalability and flexibility needed to adapt its infrastructure into a customizable communication system by means of cloudification. Softwarization and virtualization are key enablers for upcoming industries that will require ultra-low latency, which is only achievable if infrastructure equipment that was traditionally centralized in the communication network core is physically moved closer to the user, at the network edge. The main objective of this master's thesis was to implement a Reinforcement Learning algorithm (temporal-difference Q-Learning) for next-generation networks that optimally allocates Virtualized Network Functions (VNFs) to 5G Edge Computing (EC) centers. To evaluate and compare the algorithm's performance, two further algorithms were developed to solve the same problem under the same network conditions. The first, Best Fit, was inspired by a classical network load-balancing algorithm (Weighted Round Robin); the second posed the network dynamics as a finite Markov Decision Process (MDP) and solved it with dynamic programming (Policy Iteration). The tests carried out indicate that Q-Learning performs better than Best Fit and almost as well as the MDP solution, showing that the Q-Learning algorithm can optimally allocate incoming VNF demands when the EC centers' available resources are constrained.
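The tabular temporal-difference Q-Learning approach named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only, not the thesis implementation: a toy model with three EC centers of integer capacity, unit-size VNF demands, and an ad-hoc reward that penalizes rejected demands and mildly favors centers with spare capacity.

```python
import random

# Toy model (illustrative assumption, not the thesis setup): 3 EC centers,
# each with integer capacity; a state is the tuple of remaining capacities,
# and an action picks the center that hosts the next unit-size VNF demand.
N_CENTERS, CAPACITY = 3, 4
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = {}  # Q[(state, action)] -> estimated return, default 0.0

def q(state, action):
    return Q.get((state, action), 0.0)

def choose_action(state):
    """Epsilon-greedy policy over the centers."""
    if random.random() < EPSILON:
        return random.randrange(N_CENTERS)
    return max(range(N_CENTERS), key=lambda a: q(state, a))

def step(state, action):
    """Allocate one VNF demand to a center and return (next_state, reward, done)."""
    caps = list(state)
    if caps[action] == 0:
        return state, -10.0, True          # rejected demand: penalty, episode ends
    caps[action] -= 1
    reward = 1.0 + caps[action] / CAPACITY  # mild load-balancing bonus
    done = sum(caps) == 0                   # all resources consumed
    return tuple(caps), reward, done

def train(episodes=2000):
    for _ in range(episodes):
        state, done = (CAPACITY,) * N_CENTERS, False
        while not done:
            action = choose_action(state)
            nxt, reward, done = step(state, action)
            # Temporal-difference Q-Learning update:
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = reward if done else reward + GAMMA * max(
                q(nxt, a) for a in range(N_CENTERS))
            Q[(state, action)] = q(state, action) + ALPHA * (target - q(state, action))
            state = nxt

train()
```

After training, the greedy policy (argmax over `Q`) spreads demands across centers rather than exhausting one and incurring the rejection penalty, which is the load-balancing behavior the thesis evaluates against Best Fit and the MDP solution.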
Subjects: Optical communications, Machine learning, Neural networks (Computer science)
Degree: MÀSTER UNIVERSITARI EN APLICACIONS I GESTIÓ DE L'ENGINYERIA DE TELECOMUNICACIÓ (MASTEAM) (Pla 2015)
Files | Description | Size | Format
---|---|---|---
memoria.pdf |  | 7,264Mb |