A reinforcement learning algorithm for optimal virtual network function allocation among 5G edge computing centers
Cite as:
hdl:2117/349626
Author's email: carlos.ruiz.de.mendoza@estudiantat.upc.edu
Carried out at/with: Centre Tecnològic de Telecomunicacions de Catalunya (CTTC)
Document type: Official master's degree final project
Date: 2021-07-14
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
Fifth generation (5G) mobile networks are enabling operators and stakeholders to enhance existing services and introduce new ones in response to increasing market demand. The 5G architecture provides the scalability and flexibility needed to adapt its infrastructure into a customizable communication system by means of cloudification. Softwarization and virtualization are key enablers for upcoming industries that will require ultra-low latency, which is only achievable if infrastructure equipment that was traditionally centralized in the communication network core is physically moved closer to the user, at the network edge. The main objective of this master's thesis was to implement a Reinforcement Learning algorithm (temporal-difference Q-Learning) for next-generation networks that optimally allocates Virtualized Network Functions (VNFs) to 5G Edge Computing (EC) centers. To evaluate and compare the algorithm's performance, two further algorithms were developed to solve the same problem under the same network conditions. The first, Best Fit, was inspired by a classical network load-balancing algorithm (Weighted Round Robin); the second posed the network dynamics as a finite Markov Decision Process (MDP) and solved it with dynamic programming (Policy Iteration). The tests carried out indicate that Q-Learning performs better than Best Fit and almost as well as the MDP solution, showing that the Q-Learning algorithm can optimally allocate incoming VNF demands when the EC centers' available resources are constrained.
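The tabular temporal-difference Q-Learning approach named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only, not the thesis implementation: a toy model with three EC centers of integer capacity, unit-size VNF demands, and an ad-hoc reward that penalizes rejected demands and mildly favors centers with spare capacity.

```python
import random

# Toy model (illustrative assumption, not the thesis setup): 3 EC centers,
# each with integer capacity; a state is the tuple of remaining capacities,
# and an action picks the center that hosts the next unit-size VNF demand.
N_CENTERS, CAPACITY = 3, 4
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = {}  # Q[(state, action)] -> estimated return, default 0.0

def q(state, action):
    return Q.get((state, action), 0.0)

def choose_action(state):
    """Epsilon-greedy policy over the centers."""
    if random.random() < EPSILON:
        return random.randrange(N_CENTERS)
    return max(range(N_CENTERS), key=lambda a: q(state, a))

def step(state, action):
    """Allocate one VNF demand to a center and return (next_state, reward, done)."""
    caps = list(state)
    if caps[action] == 0:
        return state, -10.0, True          # rejected demand: penalty, episode ends
    caps[action] -= 1
    reward = 1.0 + caps[action] / CAPACITY  # mild load-balancing bonus
    done = sum(caps) == 0                   # all resources consumed
    return tuple(caps), reward, done

def train(episodes=2000):
    for _ in range(episodes):
        state, done = (CAPACITY,) * N_CENTERS, False
        while not done:
            action = choose_action(state)
            nxt, reward, done = step(state, action)
            # Temporal-difference Q-Learning update:
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = reward if done else reward + GAMMA * max(
                q(nxt, a) for a in range(N_CENTERS))
            Q[(state, action)] = q(state, action) + ALPHA * (target - q(state, action))
            state = nxt

train()
```

After training, the greedy policy (argmax over `Q`) spreads demands across centers rather than exhausting one and incurring the rejection penalty, which is the load-balancing behavior the thesis evaluates against Best Fit and the MDP solution.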
Subjects: Optical communications, Machine learning, Neural networks (Computer science)
Degree: MÀSTER UNIVERSITARI EN APLICACIONS I GESTIÓ DE L'ENGINYERIA DE TELECOMUNICACIÓ (MASTEAM) (Pla 2015)
Files | Description | Size | Format
---|---|---|---
memoria.pdf |  | 7,264Mb |