Deep Learning at the Mobile Edge: Opportunities for 5G Networks

Abstract: Mobile edge computing (MEC) within 5G networks brings the power of cloud computing, storage, and analysis closer to the end user. The increased speeds and reduced delay enable novel applications such as connected vehicles, large-scale IoT, video streaming, and industrial robotics. Machine learning (ML) is leveraged within mobile edge computing to predict changes in demand based on cultural events, natural disasters, or daily commute patterns, and it prepares the network by automatically scaling up network resources as needed. Together, mobile edge computing and ML enable seamless automation of network management to reduce operational costs and enhance user experience. In this paper, we discuss the state of the art for ML within mobile edge computing and the advances needed in automating adaptive resource allocation, mobility modeling, security, and energy efficiency for 5G networks.


Introduction
By 2024, 5G mobile edge computing (MEC) is expected to be a multi-million-dollar industry, with enterprise deployments reaching $73M [1]. Each year, the complexity of network data continues to grow. The rise in network system complexity stems from the increase in on-demand and customizable services. Internet service providers must accommodate traffic for web browsing, connected vehicles, video streaming, online gaming, voice over IP, and always-on Internet of Things (IoT) device transmissions. The new constraints introduced by such on-demand services require a radical transformation of fixed and mobile access networks.
Fifth-generation (5G) mobile networks are being developed to serve the increasing levels of traffic demand and diversity. To cope with the complex traffic demanded by modern users, network operators are adopting cloud-computing techniques. 5G networks will use software-defined networks (SDN) and network function virtualization (NFV) to reduce the operational cost of growing mobile networks to provide on-demand services. Long-term, end users can expect performance enhancements because 5G is optimized to provide low-latency, high-availability, and high-bandwidth communication for multiple use cases, including delay-sensitive applications, such as autonomous vehicles and automated Industry 4.0 robotics.
The stringent functional requirements for 5G networks have forced designers to rethink the backbone and access network architectures to better support core functions and dynamic network services. The introduction of mobile edge computing disrupts the traditional separation between the access network (secure and reliable transport between end users) and the core network (information computing and storage). The combination of mobile edge computing and cloud computing clears the obstacles on the path towards the full automation of mobile edge computing. We were motivated to carry out this survey to investigate the potential and challenges introduced by deploying deep learning at scale at the mobile edge.

Contributions
The contribution of this document is three-fold: 1) create a taxonomy of the distributed computing and storage resources that connect the edge of the network with end users and devices, 2) discuss the application of deep learning to edge computing to meet the functional requirements of 5G networks, and 3) provide an overview of new applications, standardization efforts, and challenges that have arisen from introducing deep learning into mobile edge computing in 5G networks.
In this paper, we explain the work undertaken and the challenges in applying the power of deep learning in 5G mobile edge computing to serve low-latency, real-time applications by providing adaptive, application-specific resource allocation and security while accommodating high user mobility. With DL, mobile edge computing can drive 5G networks to meet the stringent requirements imposed by a wide range of applications, such as real-time operation, security, and energy efficiency in industrial environments.
The rest of the paper is organized as follows: Section 2 presents an overview of 5G and mobile edge computing enabling technologies. Section 3 offers a background to the deep learning (DL) techniques commonly used in network management. Section 4 discusses the current work and addresses open issues in mobile edge computing that could be solved by further interdisciplinary ML work. Section 5 provides an overview of protocols and architectures recently designed to automate network management with ML. Section 6 discusses state-of-the-art applications and use cases that mobile edge computing hopes to enable, including autonomous vehicles, industrial robotics, and massive IoT scale-up. The paper concludes in Section 7.

Mobile Edge Computing
This section is divided into two parts. In the first part, the three main functional requirements of the 5G network are introduced to show that full 5G deployment requires computing, storage, and network infrastructure close to the end user, whether that user's infrastructure is fixed or mobile. The second part introduces a mobile edge computing taxonomy, clarifying the functionalities and the geographic areas that edge computing covers to demonstrate how essential mobile edge computing is to 5G deployments.
At the same time, edge computing is shown to be a cornerstone of 5G deployment. Addressing rapidly changing Internet demand requires rethinking network and information delivery designs. A combination of newly developed 5G networks and mobile edge computing (MEC) will enable Internet service providers (ISPs) to meet consumer demands.

5G Network Purpose and Design
Each generation of mobile network standards has been designed in response to the changing use of mobile communications. 4G and LTE networks enhanced capabilities beyond 3G support for simple mobile browsing and messaging systems. Similarly, 5G networks have been designed with three main goals to improve network performance for the next decade (see Figure 1): 1. Enhanced mobile broadband (eMBB) will support general consumer applications, such as video streaming, browsing, and cloud-based gaming. 2. Ultra-reliable low-latency communications (URLLC) will support latency-sensitive applications, such as AR/VR, autonomous vehicles and drones, smart city infrastructure, Industry 4.0, and tele-robotics. 3. Massive machine-type communications (mMTC) will support scalable peer-to-peer networks for IoT applications that do not require high bandwidth.
5G networks accomplish the high-bandwidth, high-availability, and low-latency requirements of new Internet services and applications through the adoption of cloud-computing infrastructure. Cloud providers use software-defined networks (SDN) and network function virtualization (NFV) to speed the creation of services and to facilitate multi-tenant and multi-service infrastructure. The move to SDN infrastructure enables the replacement of proprietary hardware and software for network functions like routers or firewalls with cheaper, standardized, and re-programmable virtual customer premises equipment (vCPE). NFV is centrally controlled from the cloud. Virtual network functions (VNFs), a key element of NFV, such as load balancers, can run on any generic server, allowing the network to scale up resources on demand and to migrate functions between different parts of the network. However, data processing in large remote cloud data centers based on SDN/NFV functionalities cannot meet the low latency required for real-time data analytics in autonomous vehicles or locally based augmented/virtual reality (AR/VR).
The increased communication taking place on smartphones, tablets, wearables, and IoT devices can congest the core network communicating with the centralized cloud servers. Duplicate requests for popular videos during peak streaming times can overwhelm core networks and lead to a low quality of experience (QoE) for users and costly inefficiencies in network resource usage [2]. Mobile edge computing can help solve these challenges in 5G by creating a decentralized cloud at the network edge.

Mobile Edge Computing for 5G
Various models of computing operate in the network environment, including mobile computing, cloud computing, fog computing, and edge computing. A taxonomy of the network computing paradigm is detailed in [7].
Mobile computing (MC) creates an isolated, non-centralized, network-edge or off-network environment made up of elements (mobile devices, IoT devices, etc.) that share network, computing, and storage resources. In contrast, cloud computing offers ubiquitous on-demand computing resources. These computing services can be public, private, or hybrid, and they use various pay-per-use mechanisms.
Edge computing (EC) is a system that offers networking, computing, and storage services near the end devices, generally at the edge of the network. This system, which takes the shape of a mini data center, has high availability and can offer low-latency services, but its computing and storage resources have lower capacity than those of cloud computing.
Mobile edge computing (MEC) combines the functions of mobile computing with edge computing. The edge computing infrastructure is complemented by the resources of mobile or IoT devices with low-consumption computing and storage hardware and non-permanent or low-reliability communications. The mobile edge computing system has been extended and standardized by the European Telecommunications Standards Institute (ETSI), which coined the term multi-access edge computing.
Figure 2. Illustration of the application area of the main network computing models focused on the edge.
Mobile edge computing enables cloud-based data for real-time applications and ultra-reliable services to be stored closer to end users at edge nodes and base stations. Mobile edge computing empowers AI-based services, such as navigation and connected vehicles, that require large amounts of locally relevant data computation, management, and analysis.
The choice of edge device depends on the application of mobile edge computing. For example, 5G base stations can be used to assist vehicle-to-vehicle communication for autonomous driving or RAT aggregation sites can be used for delivering locally relevant, fast services to dense public locations, such as stadiums or shopping malls. Several industry cloud providers have already developed software and hardware solutions to enable mobile edge computing in edge devices (Microsoft Azure's IoT Edge [17], Google's Edge TPU [18], or Amazon's IoT Greengrass [19]), thereby establishing mobile edge computing as a relevant technology for next-generation networks and content delivery. Layering ML on top of mobile edge computing infrastructure enables automation of efficiency-enhancing network functions at the edge. Together, ML and mobile edge computing can enable real-time applications with low-latency cloud computation.

Deep Learning Techniques
Machine learning systems use algorithms that improve their output based on experience. In the future, machine learning will replace traditional optimization methods in many fields because ML models can expand to include new constraints and inputs without starting from scratch, and they can approximate solutions to mathematically complex problems. ML models are readily adapted to new situations, as we are currently witnessing across computing systems.
In the last decade, a subset of machine learning called deep learning (DL) has garnered much attention in computer vision [20,21] and has discovered new optimal strategies for games [22,23] without the costly hand-crafted feature engineering previously required. Deep learning uses neural networks to perform automated feature extraction from large data sets and then uses these features in later steps to classify input, make decisions, or generate new information [24].
Research on deep learning for computer vision exploded after the release of ImageNet, a curated database of over 10 million images across 10,000 categories, which helped train ML image classification models [25]. Because 5G implementations are new and deployed only in select regions, there are few representative data sets for 5G network traffic, and many authors rely on simulations. 5G and slice-based networking may also change the models of service demand. Moreover, few authors have representative and detailed data from telecommunication companies [26,27] because of the risk of leaking proprietary or customer information.
Combined with the adaptation of SDN/NFV techniques within 5G networks, deep learning presents an opportunity for accurate identification and classification of mobile applications and automates the creation of adaptive network slicing [28], among other possibilities. Figure 3 shows examples of four common deep learning models, which are explained in the following subsections.

Deep Neural Networks
Deep neural networks (DNNs) were developed to parametrically approximate functions that map a vector of input data to an output vector (e.g., an image to the percentage probability that it belongs to each of five class labels) [24,29]. Expanded from the original simple perceptron systems, deep neural networks are feed-forward systems that use many hidden layers and internal functions to approximate non-linear relationships between input and output. Each DNN model uses gradient descent to minimize a cost function (e.g., mean squared error or maximum likelihood) in an optimization and training process called "back-propagation". Cost functions for 5G networks could also include minimizing operating costs, latency, or downtime. Within 5G networks, deep learning has already been applied to themes such as traffic classification, routing decisions, and network security [30].
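To make the training loop concrete, the forward pass and one gradient-descent step of a small feed-forward network can be sketched as follows (a minimal numpy illustration with arbitrary layer sizes and learning rate, not a model from the surveyed literature):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, W1, b1, W2, b2):
    """Two-layer feed-forward network: input -> hidden (ReLU) -> output."""
    h = relu(x @ W1 + b1)
    y = h @ W2 + b2
    return h, y

def backprop_step(x, target, W1, b1, W2, b2, lr=0.01):
    """One gradient-descent step minimizing 0.5 * sum of squared errors."""
    h, y = forward(x, W1, b1, W2, b2)
    err = y - target                     # gradient of the loss w.r.t. y
    dW2 = h.T @ err
    db2 = err.sum(axis=0)
    dh = (err @ W2.T) * (h > 0)          # propagate through the ReLU
    dW1 = x.T @ dh
    db1 = dh.sum(axis=0)
    return (W1 - lr * dW1, b1 - lr * db1,
            W2 - lr * dW2, b2 - lr * db2)
```

In a 5G setting, the input vector could hold network measurements and the cost function could encode latency or operating cost, as noted above.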
Convolutional neural networks (CNNs) are DNN models specialized to handle large, high-resolution images as inputs [24,31]. CNNs exploit the relationships in nearby data (such as pixels in an image or location-based measurements). CNNs use mathematical "convolutions", linear operations that compute a weighted average of nearby samples, and "pooling" to summarize the data in a region, typically with max-pooling or a similar operation depending on the aim [24]. CNNs have been used in 5G networks to predict mobility based on traffic flows between base stations [32] and for object classification, which is also applicable to 5G-enabled industrial robotics [33].
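The two core CNN operations can be illustrated directly (a toy single-channel numpy sketch; the example kernel computes a 3x3 average, and, as in most DL libraries, the "convolution" is implemented without flipping the kernel):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (no padding): a weighted average of nearby pixels
    is computed at every position the kernel fits."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Summarize each size x size region by its maximum activation."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = feature_map[i * size:(i + 1) * size,
                                    j * size:(j + 1) * size].max()
    return out
```

Production CNN libraries fuse and heavily optimize these loops; the sketch only shows the arithmetic.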
Recurrent neural networks (RNNs) extend the DNN approach to function approximation to handle temporal sequences, where the output from a previous time step influences the decision the network makes for the next step [24]. RNNs require a "memory" to recall information learned in previous time steps in addition to the current input to the model. Performing gradient descent-based training on RNNs caused issues with "exploding gradients", which were corrected by new ML models called long short-term memory (LSTM) models with additional information flow structures called "gates". LSTM models have proven useful and accurate in solving traffic prediction [34] and mobility [35] problems within communication networks.
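The recurrent "memory" can be sketched as a hidden state carried between time steps (a minimal vanilla-RNN cell in numpy; LSTM gates, not shown here, add multiplicative controls over what enters, stays in, and leaves this state):

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One recurrent step: the new hidden state mixes the current input
    with the memory of everything seen so far."""
    return np.tanh(x_t @ Wx + h_prev @ Wh + b)

def rnn_forward(sequence, Wx, Wh, b):
    """Run the cell over a whole sequence; the final hidden state
    summarizes the history in order."""
    h = np.zeros(Wh.shape[0])
    for x_t in sequence:
        h = rnn_step(x_t, h, Wx, Wh, b)
    return h
```

Because the state is updated step by step, the same inputs in a different order generally produce a different final state, which is what lets RNNs model temporal patterns such as traffic time series.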

Reinforcement Learning
Reinforcement learning (RL) has been lauded in the last decade for training ML systems to outperform humans, culminating in DeepMind's development of a winning machine even in the complex and large game space of Go [22,36]. RL is incredibly powerful because training the model does not require any prior knowledge of the rules of the games played; rather, the model is optimized for future rewards using an "agent" that makes observations about its environment (pixels, game state, sensor inputs, etc.) and the rewards (points, coins, closeness to the end goal) received in response to the actions (turning, moving, shooting, etc.) it takes. The agent determines which action to perform based on a "policy", the output of a neural network trained using "policy gradients". In RL, discount rates can be applied to increase or decrease the importance of immediate versus projected long-term rewards when determining the optimal next action.
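The discount rate mentioned above enters through a simple recursion: the return at step t is the immediate reward plus gamma times the return from step t+1. A minimal sketch (the reward sequence is purely illustrative):

```python
def discounted_returns(rewards, gamma=0.9):
    """Compute G_t = r_t + gamma * G_{t+1} for each step, back to front."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns
```

With gamma close to 0 the agent values only immediate rewards; with gamma close to 1 distant rewards count almost as much as immediate ones.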
Q-learning is a subset of RL where models are built on Markov decision processes, stochastic processes where the next state depends only on the current state and action, not on the earlier history of state changes [37]. Q-values estimate the value of state-action pairs, and the agent selects the action with the maximum reward value [36]. Q-learning is promising because, even with imperfect or no information about the underlying Markov decision process's transition probabilities, the agent explores the states, rewards, and possible action pairs through greedy exploration policies that favor exploring unknown and high-reward regions. The optimal decision policy for actions to take can then be obtained with very little prior information. The agent then takes the action with the maximum Q-value calculated by the policy. Deep Q-learning networks can model situations with large state spaces (e.g., Go) by using a DNN to train the policy without feature engineering and a set of replay memories to continue using information learned in previous steps [36].
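The tabular Q-value update, Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)), can be sketched with an epsilon-greedy agent on a made-up toy environment (the three-state chain below is invented for illustration and does not model any real network):

```python
import random

def q_learning(n_states, n_actions, step_fn, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning with an epsilon-greedy exploration policy."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0                                   # every episode starts in state 0
        done = False
        while not done:
            if random.random() < epsilon:       # explore a random action
                a = random.randrange(n_actions)
            else:                               # exploit the best known action
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s_next, r, done = step_fn(s, a)
            # Move Q(s, a) toward r + gamma * max_a' Q(s', a')
            target = r + (0.0 if done else gamma * max(Q[s_next]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
    return Q

def chain_step(s, a):
    """Hypothetical 3-state chain environment (illustration only):
    action 1 moves right toward a reward of 1 at the last state;
    action 0 abandons the episode with no reward."""
    if a == 0:
        return s, 0.0, True
    if s == 2:
        return s, 1.0, True
    return s + 1, 0.0, False
```

After enough episodes, the Q-table prefers action 1 in every state, i.e., the agent has recovered the optimal policy without ever being told the environment's transition rules.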
Deep Q-learning networks (DQNs) are especially adaptable to open issues within the 5G sphere. Mobile networks are increasingly dynamic: the number of apps and users and the topology of the network have become increasingly ad hoc. The ML systems used to approximate solutions for these networks must be equally flexible. DQNs could be applied here because they discover new optimal policies after observing additional situations without requiring the model to be completely retrained. Several teams have already used deep Q-learning to address ad hoc mobile edge computing vehicular networks for 5G [30].

Enabling ML in the Mobile Edge
Performing ML within edge devices can take advantage of available contextual data such as cell load, user location, connection metadata, application types, local traffic patterns, and allocated bandwidth. The latency of responses from traditional cloud-computing centers over the wide-area network hinders network and service key performance indicators (KPIs). In addition, performing ML tasks at the edge can reduce the load on the core network. To take full advantage of the benefits of combining mobile edge computing and ML, ML models must be designed to use minimal resources and still obtain useful and accurate results as they are scaled across expansive communication networks.
Currently, ML training and inference tasks within mobile edge computing are partially inhibited by the smaller storage capabilities and more limited power supplies of edge devices compared with those found in industrial cloud data centers. In response, ML within mobile edge computing has been enabled by two main enhancements: 1. efficient ML models specialized to require less energy, memory, or time to train, and 2. distributed ML models that split the training and inference tasks between large data centers and smaller edge devices for parallel processing and efficiency.

Efficient ML Models
Currently, ML models require abundant memory to store training data and computational power to train the large models. Novel ML models have been designed to operate efficiently on edge devices by employing shallow models that require little enough processing power that they can be used on IoT devices [38]. Alternatively, reducing the size of the model's inputs for classification applications can increase the speed of learning and convolutions on edge devices when less granular decisions are required [39]. The computational requirements for ML model training can be further reduced by early exiting in models designed with multiple exit points that return a result once sufficient accuracy has been achieved [40,41], or by human-machine collaboration in which CNNs based on existing expert designs explore and design new efficient ML architectures [42]. However, model redesign is only the first step to achieving efficient ML in mobile edge computing.
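The early-exit idea of [40,41] can be sketched as a confidence check between stages (a schematic illustration only; the stage functions, exit heads, and threshold are placeholders, not any published architecture):

```python
def early_exit_inference(x, stages, exit_heads, threshold=0.9):
    """Run model stages in order; stop as soon as an exit head is confident.

    stages     -- feature extractors, applied in sequence
    exit_heads -- for each stage, a classifier returning (label, confidence)
    """
    for stage, head in zip(stages, exit_heads):
        x = stage(x)
        label, confidence = head(x)
        if confidence >= threshold:
            return label, confidence    # cheap exit: later stages are skipped
    return label, confidence            # fell through to the final head
```

Easy inputs exit after the first, cheapest stage, so the average compute per inference on a constrained edge device drops while hard inputs still receive the full model.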

Distributed ML Models
DNNs are a widely adopted ML technique, but the full burden of training a DNN model is too intensive for a single resource-constrained device at the mobile edge. Distributed ML models are well-adapted to mobile edge computing because the work is distributed across many computing centers in the network (cloud, base stations, edge nodes, end devices) [43], which collectively train the DL model by each performing a small portion of the work and then combining the results [44,45]. Sub-tasks for the training can be allocated based on each edge device's resource constraints and on distributed work-stealing models that prioritize load balancing in inference tasks [46].
Within mobile edge computing, distributed learning aims to use multiple smaller edge devices rather than one large data center. Distributed DNNs combine local parameter adjustments during the learning process with global aggregation steps in the cloud to achieve a single well-trained model [43,47]. Optimization of the aggregation step can include methods such as federated dropout, prioritized local updates, fast convergence, and compression [43,44], while local learning can be optimized using efficient ML models as described in the previous section.
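The global aggregation step can be sketched as a weighted average of locally updated parameters (a minimal federated-averaging sketch under the assumption that each edge device reports its parameter vector and local sample count; real systems add the aggregation optimizations listed above):

```python
def federated_average(local_params, local_counts):
    """Aggregate edge-device models into one global model by averaging
    each parameter, weighted by how many local samples the device saw."""
    total = sum(local_counts)
    n_params = len(local_params[0])
    global_params = [0.0] * n_params
    for params, count in zip(local_params, local_counts):
        weight = count / total
        for i, p in enumerate(params):
            global_params[i] += weight * p
    return global_params
```

Devices with more local data pull the global model more strongly, and only parameter vectors (never raw user data) cross the network.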
However, distributed learning can introduce new challenges. In [48], the authors found that the latency and cost of sharing the learned gradients between devices constituted a bottleneck during the training processes. To overcome the communication bottleneck, gradients used during the back-propagation process of model training were compressed to reduce bandwidth requirements and redundancy. Additional efforts to selectively share only the important gradients in the training process have reduced communication costs with minimal impact on accuracy [49] and help reduce core network traffic and memory footprint on resource-constrained devices.
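The selective-sharing idea of [49] can be illustrated with simple top-k sparsification (a schematic sketch: only the k largest-magnitude gradients are transmitted as (index, value) pairs, cutting bandwidth at a small accuracy cost):

```python
def sparsify_gradients(gradients, k):
    """Keep only the k gradients with the largest magnitude,
    encoded as (index, value) pairs for transmission."""
    ranked = sorted(range(len(gradients)),
                    key=lambda i: abs(gradients[i]), reverse=True)
    return [(i, gradients[i]) for i in sorted(ranked[:k])]

def densify(sparse, length):
    """Rebuild a full gradient vector, zero-filling untransmitted entries."""
    dense = [0.0] * length
    for i, value in sparse:
        dense[i] = value
    return dense
```

For a million-parameter model with k at 1% of the entries, each round transmits roughly two orders of magnitude less data, easing exactly the communication bottleneck identified in [48].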

Challenges of DL for 5G Operations at the Mobile Edge
This section introduces a taxonomy of the challenges of applying DL at the edge in 5G networks, which is the basis of this survey. Figure 4 shows this taxonomy, which categorizes the research articles discussed in this paper that focus primarily on applications of deep learning techniques in network operations.
In this section, we describe how deep learning has been applied to solve operational issues at the mobile edge. Mobile edge computing (MEC) has the advantage of proximity to users, which can meet the low latency (URLLC), high bandwidth (eMBB), and high availability (mMTC) goals of 5G networks by leveraging the breakthroughs discussed in the previous section.
5G networks present interesting challenges that are best addressed at the mobile edge to reduce latency and incorporate locally significant information. In particular, mobile edge computing can leverage proximity to the user to address a variety of challenges in 5G networking that often require automated management using DL for increasingly complex series of tasks.
Solutions combining DL with 5G promise better efficiency when executed near the end user in mobile edge computing rather than in the core network. For instance, mixing mobile edge computing with 5G networks seamlessly connects existing cloud computing with edge computing to enable novel applications. The potential applications of DL within the networking domain are many, but this paper focuses on a few key areas: 5G slicing using traffic prediction within mobile edge computing, adaptive resource allocation to meet user demand in real time, predictive caching to reduce latency, task offloading from nearby end devices, meeting service quality guarantees, efficient energy usage, data security and privacy, and network architectures, standards, and automation. Though these may appear to be separate tasks, they are intricately connected.
Several challenges remain before DL will be fully applicable at the 5G mobile edge, many of which are relevant to machine learning systems at large. Table 1 provides an overview of the research discussed in this paper, giving readers an accessible summary of the methods and key outcomes of past research.
Table 1. Some relevant applications of deep learning techniques used in network operations discussed in this paper. Some papers could belong to multiple categories because the themes in automated network management overlap, but for simplicity each is listed only once.

Paper  DL Model  Purpose/Methods
[27]   DNN       Uses spatio-temporal relationships between stations to predict future traffic demand patterns
[50]   RNN       Predicts base station pairing for highly mobile users to prepare for future demand
[35]   RNN       Minimizes signaling overhead, latency, call dropping, and radio resource wastage using predictive handover
[26]   RNN       Predicts traffic demand by exploiting space and time patterns between base stations
[28]   DNN       Identifies real-time traffic and assigns it to the relevant network slice
[54]   RL        Offers slicing strategy based on predictions for traffic and resource requirements
[56]   RL        Maximizes network provider's revenue through automated admission and allocation for slices
[57]   RL        Provides automated priority-based radio resource slicing and allocation
[53]   RL

Mobility Modeling
User mobility prediction is necessary to achieve accurate traffic prediction. With the rise of mobile phone usage and connected vehicles, predicting mobility becomes an important step in understanding mobile network demands. Mobility models can be developed by considering different environments, such as urban [27] or highway patterns, to predict the next station a user will likely connect to [50] in order to reduce the costs of operational tasks such as handover [35]. Once the mobility patterns of users in a network are understood, DL can also be used to predict traffic patterns and create more cost-efficient network operation schemes. For example, the expected demand at a base station can be predicted according to its spatial and temporal relationships with nearby stations [26]. Other studies apply LSTM models to predict the position of the UE over time [51].
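As a baseline for the next-station prediction addressed in [50], handover histories can be reduced to transition counts (a first-order Markov sketch with made-up station IDs; the cited work uses RNNs precisely because they capture longer histories than this one-step model):

```python
from collections import Counter, defaultdict

def build_transition_model(handover_logs):
    """Count station-to-station handovers across all observed user traces."""
    transitions = defaultdict(Counter)
    for trace in handover_logs:
        for current, nxt in zip(trace, trace[1:]):
            transitions[current][nxt] += 1
    return transitions

def predict_next_station(transitions, current):
    """Return the most frequently observed next station, or None if the
    current station was never seen in training."""
    if current not in transitions:
        return None
    return transitions[current].most_common(1)[0][0]
```

A DL model improves on this baseline by conditioning on the whole trajectory, the time of day, and the speed of the user rather than on the current station alone.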

Slicing
Slicing is the method by which network providers can create multiple independent virtual networks over shared physical infrastructure for 5G networks. While traditional mobile networks treat all incoming traffic similarly, 5G network slices can provide customized network services and scale up or down as the demand for each slice changes. Slicing is made possible by SDN and NFV; the separated control plane is composed of ready-for-change software that facilitates adaptive and intelligent network management. 5G network providers will create customized slices based on use cases (video, IoT, industrial robotics, etc.), each created to meet unique service-level agreements (SLAs) [52] (see Figure 5).
5G networks can pair DL and data collected with mobile edge computing to automatically manage slice creation. Automatically spinning up resources for network slices first requires predictions of network demand and user location to assign resources correctly at edge nodes. Useful traffic prediction requires predicting user mobility and the demand for network resources, as well as classifying traffic origin in real time to assign it to the correct slice.
Figure 5. Physical infrastructure and virtual resources (network, computation, storage) in service chaining for 5G slices. Each slice has a specific purpose and type of end device, but several slices may use the same types of virtual network functions (VNFs) to deliver services to their users. The VNFs for each slice are separated for privacy and security.
The research conducted in [26] set the groundwork for this field by using LSTM-based RNN models to analyze real-world cellular data from China to understand the relationship between traffic at separate cell towers. The technique was improved by using social media and other data sources to analyze the effect of key events in a city, such as sporting games, on network demand [27]. From here, DL can be used to classify traffic without privacy-invading techniques, such as packet inspection or strict classification based on ports or packet signatures [28]. Once the traffic type is understood, network operators can take advantage of network virtualization to create E2E slices per application and dynamically meet each SLA independently [54], while still achieving optimal resource usage.
Complete network slicing requires allocating virtual resources to a subset of traffic and isolating these resources from the rest of the network. Predicted demand influences which and how many resources are allocated per slice [55] and determines whether new users are permitted to join the network at the given station. These decisions are based on forecasts for available resources within the slice [56] in a process built on "admission control", which aims to increase revenue through efficient resource usage. DL can give some slices priority over others [57] and adapt the slicing decisions in anticipation of dynamic service demands to maximize resource use [53]. This task is complicated as unknown devices continue to join the network, but with the aid of deep learning, even these can be automatically assigned to slices to balance the network load and improve slice efficiency [103]. Maintaining efficient resource usage requires ongoing resource allocation between and within slices as discussed in the next section.

Resource Allocation
Resource allocation in mobile edge computing is the task of efficiently allocating available radio and computational resources to different slices based on their requirements and priority. Historically, resource allocation was a reactive step aimed at self-healing and fault tolerance. Proactive resource allocation with DL can reduce the effects of costly under-provisioning mistakes, which cause SLA violations, poor user experience, and customer churn. To reduce operational costs, deep learning techniques borrowed from image processing are used to anticipate network capacity based on metrics gathered at the mobile edge, such as signal quality, occupied resource blocks, and local computation loads [32]. Inter-slice resource allocation can be achieved by jointly optimizing according to slice priority (a slice providing remote healthcare could have priority over one for video streaming), under-provisioning risk, fairness concerns [58], and QoE per slice [59].
Resource allocation using DL is especially useful for predicting the resources needed for tasks offloaded to the mobile edge network from smaller devices [60] and proactively assigning some of the limited available resources. Because mobile edge computing can exploit local views of wireless signal and service request patterns, resource allocation models operating with this information can further minimize delays [61] and respond in real time to observed changes. DL models can also be applied beyond resource optimization for a single edge node by including both spatial and temporal dependencies between traffic at nearby edge nodes to predict how these dependencies and dynamic regional traffic patterns will affect resource demands [26,27].
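A simple inter-slice allocation baseline consistent with the priority-and-fairness framing above can be sketched as weighted proportional sharing (an invented illustration, not the optimization formulated in [58,59]; demands, priorities, and units are arbitrary):

```python
def allocate_resources(total, demands, priorities):
    """Split `total` resource units across slices in proportion to
    priority-weighted demand, never exceeding a slice's own demand."""
    weights = [d * p for d, p in zip(demands, priorities)]
    weight_sum = sum(weights)
    if weight_sum == 0:
        return [0.0] * len(demands)
    return [min(d, total * w / weight_sum)
            for d, w in zip(demands, weights)]
```

The `min` cap means capacity freed by a fully satisfied slice is not redistributed here; a DL-based allocator learns exactly such second-order reallocations, along with temporal demand patterns this static rule ignores.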

Caching
Mobile edge computing can exploit proximity to the user to cache locally relevant data nearby to reduce latency and adapt to surges in popularity for certain content in a region. Employing DL to develop a proactive caching strategy has been shown to improve network efficiency and alleviate demand for radio resources by storing popular content closer to the user than regional datacenters [62,63]. Effective caching consists of two fundamental steps: predicting content requests using popularity estimates and allocating content among edge nodes.
The popularity of content influences how often, and in which regions of the network, users will request the content from the cache. More popular content should be cached to avoid delayed retrieval from regional data centers and to reduce traffic on the core network. Some studies have used user mobility and behavioral patterns to predict application choices and develop DL caching strategies that anticipate users' desired content [64]. Which contents are placed in mobile edge computing caches can be optimized using DL to increase the cache hit rate and decrease the delays experienced by users. By using popularity models in conjunction with observed requests, an optimal cache placement strategy can be developed using DL techniques such as Q-learning [65,66]; even as users move around, their desired content is more likely to be in a nearby cache.
Content popularity is necessarily dynamic (it shifts with the time of day, cultural events, or trending content), so cache contents must be updated frequently. Partial cache refreshes based on DNNs provide online responses to changing popularity [67], and the contents of groups of edge nodes in close proximity can be updated through joint optimization to reduce cache redundancy [68].
In [69], the authors propose an effective approach for collecting globally available resource information through an SDN-based mobile network architecture. To minimize network latency, they designed an optimal caching strategy consisting of a small-cell cloud and a macro-cell cloud, which considerably reduces latency compared to conventional caching strategies. Figure 6 shows the main steps of predictive caching using ML in mobile edge computing compared with the traditional procedure.

Task Offloading
Due to their proximity to users and their potentially high density in 5G networks, small-cell stations can be used to offload tasks that are too computationally intensive or battery-consuming for most users' mobile devices. DL systems can be trained in mobile edge computing systems to minimize the cost of offloading tasks in vehicular networks [70] and from small wireless devices [60] by using both immediate and long-term rewards during the training stage. These task offloading systems respond to real-time changes in demand for computation and supporting resources at the mobile edge computing nodes. As the scale of task offloading grows until each edge node is simultaneously receiving and running computational requests, the scheduling objective becomes almost intractable without machine learning techniques. Additional intelligent systems have been designed to simultaneously minimize the costs of energy, computation, and delay by exploiting DQNs to schedule AR/VR offloading tasks [71] and then relying on additional DL techniques to reallocate resources at the mobile edge in real time. Beyond cost minimization, task offloading schemes can also be trained to minimize delay or to maximize users' quality of experience or quality of service (QoE/QoS) [70].

Figure 6. These two images show the difference between traditional caching and predictive caching using machine learning in mobile edge computing. In step 1 of predictive caching, the most popular contents matching users' predicted preferences, according to their profiles, are downloaded from the cloud to the edge node. In the second step, when a user requests specific content, there is a higher probability that the desired content has already been downloaded to the edge node, increasing QoE.
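The immediate-cost side of such an offloading decision can be sketched as a weighted energy-delay comparison. All parameters below (CPU rates, transmit power, weights) are hypothetical; the cited DL systems learn this trade-off together with long-term rewards rather than evaluating it in closed form.

```python
# Hypothetical sketch: should a task run locally or be offloaded?
# Costs are weighted sums of energy and delay for each option.

def task_cost(cycles, data_bits, local_power, cpu_hz,
              uplink_bps, tx_power, edge_hz, w_energy=0.5, w_delay=0.5):
    """Return (local_cost, offload_cost) as weighted energy+delay sums."""
    # Local execution: all CPU cycles run on the device.
    local_delay = cycles / cpu_hz
    local_energy = local_power * local_delay
    # Offloading: transmit the input data, then execute on the edge server
    # (device energy modeled as transmission only, a simplification).
    tx_delay = data_bits / uplink_bps
    offload_delay = tx_delay + cycles / edge_hz
    offload_energy = tx_power * tx_delay
    return (w_energy * local_energy + w_delay * local_delay,
            w_energy * offload_energy + w_delay * offload_delay)

def should_offload(**kwargs):
    local, offload = task_cost(**kwargs)
    return offload < local

# A compute-heavy task with a small input favors offloading; a light task
# with a large input favors local execution (all numbers assumed).
heavy = should_offload(cycles=5e9, data_bits=8e6, local_power=2.0,
                       cpu_hz=1e9, uplink_bps=50e6, tx_power=0.5, edge_hz=20e9)
light = should_offload(cycles=1e7, data_bits=8e7, local_power=2.0,
                       cpu_hz=1e9, uplink_bps=50e6, tx_power=0.5, edge_hz=20e9)
```

A learning-based scheduler would replace this closed-form rule with a policy trained on observed outcomes, letting it also account for queueing at the edge node and future demand.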

URLLC and eMBB through Quality of Service Constraints
Quality of Service (QoS) measures the performance experienced by the end users of the network. Common QoS metrics include bandwidth, latency, and error rate, and changes in these parameters can severely alter the network's ability to provide critical services. Every user connects to the network with a set of QoS requirements, which may be more stringent for latency-sensitive applications such as on-demand video streaming and voice over IP (VoIP). Meeting QoS requirements and service-level agreements (SLAs) can be integrated as a goal in DL systems for automated resource allocation in 5G mobile edge computing [59] by requiring chosen allocation schemes to keep network operations within QoS ranges [73]. For mobile edge computing applications such as robotic systems, resource allocation systems based on DL and QoS metrics must capture the ultra-low-latency requirements for feedback signals between end devices [74], ensuring both safety and Quality of Experience (QoE). Furthermore, QoE, as it relates to QoS, can be optimized based on user similarity within groups or geographical regions to dynamically allocate resources according to group needs [75].
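A minimal sketch of QoS-range filtering: candidate allocation schemes (which in [59,73] a DL model would propose) are screened against assumed QoS limits, and the cheapest feasible one is selected. The metric names, limits, and costs below are hypothetical.

```python
# Hypothetical sketch: select the cheapest allocation that stays
# within QoS ranges, the constraint described for [59,73].

QOS_LIMITS = {"latency_ms": (0, 20), "loss_rate": (0, 0.01)}  # assumed SLA

def meets_qos(metrics, limits=QOS_LIMITS):
    """True if every predicted metric falls inside its allowed range."""
    return all(lo <= metrics[k] <= hi for k, (lo, hi) in limits.items())

def pick_allocation(candidates):
    """Cheapest candidate whose predicted metrics satisfy the QoS limits."""
    feasible = [c for c in candidates if meets_qos(c["metrics"])]
    return min(feasible, key=lambda c: c["cost"]) if feasible else None

candidates = [
    {"name": "minimal", "cost": 1.0,
     "metrics": {"latency_ms": 35, "loss_rate": 0.02}},   # violates QoS
    {"name": "medium", "cost": 2.0,
     "metrics": {"latency_ms": 12, "loss_rate": 0.005}},  # feasible
    {"name": "maximal", "cost": 4.0,
     "metrics": {"latency_ms": 5, "loss_rate": 0.001}},   # feasible, costly
]
best = pick_allocation(candidates)
```

Here the "medium" scheme wins: it is the cheapest option that keeps latency and loss within the assumed SLA ranges, while "minimal" is rejected outright.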
Achieving ultra-reliable low-latency communication (URLLC) is one of the major challenges in 5G networks. This type of service presents a wide variety of challenges, such as strict QoS requirements [77], strict handover requirements for uninterrupted service [78,79], and battery power consumption in user equipment (UE) [80]. In addition, the coexistence of eMBB and URLLC, with their different service requirements, is also a challenge [76].
The authors of [69] propose a novel network architecture that uses a resource cognitive engine and a data engine to achieve ultra-low end-to-end delay for the ever-growing number of cognitive applications. The resource cognitive intelligence, based on network context learning, aims to attain a global view of the network's computing, caching, and communication resources.
Meeting URLLC specifications is a significant challenge for 5G networks and will entail major changes to the system architecture of the existing telecom infrastructure. While current user requirements are primarily driven by high bandwidth, latency and reliability are expected to play a vital role in real-time applications and mission-critical networks [81].

Energy Consumption
Training DL models requires high energy consumption. Even as training speeds have improved, the energy consumption of DL models remains high [82]. While some studies have attempted to estimate energy costs [83] and others have developed algorithms that increase the energy efficiency of the systems they manage [84,85], few efforts have combined these two critical topics. Many open questions remain about the deployability of large-scale DL models on resource-constrained mobile edge computing systems.

Security and Privacy
Two main concerns that have deterred the deployment of large-scale DL systems at the mobile edge are data security and privacy for collected data. Security must be guaranteed in 5G mobile edge computing because of the necessary sharing of physical infrastructure between slices and the potential for leaks of data or usage patterns. As NFV is deployed, isolation between virtual machines and slices must also be guaranteed to protect privacy and reduce performance interference [89]. 5G mobile edge computing networks must also defend against malicious actors and can use DL to detect and protect against attacks [86,87], though new slicing infrastructure and virtualized networks may require deviation from industry-standard security techniques. Efforts to enhance DL performance should also be built with privacy in mind [88]. Network providers must investigate potential privacy violations in the collection or use of user data before large-scale ML systems are deployed, especially in the case of smart cities or personal electronics, such as connected cars and IoT devices, which can reveal intimate details about the public as a whole.

Standards towards 5G Automation
Developing end-to-end automated management of the 5G architecture and its services, together with the integration of mobile edge computing into 5G, introduces new requirements. The set of multi-access applications and services at the edge designed to meet these requirements greatly increases the complexity of managing the networked system. This complexity manifests itself in different aspects of the network and its services, such as service provisioning and operation, predictive analysis, real-time monitoring, analytics, and the maintenance of thousands of entities, and it inexorably drives end-to-end automation of networks and services. The application of ML in mobile edge computing and 5G, which will enable self-configuration, self-optimization, and self-healing, will also require component standardization.
Several organizations, including 3GPP, ETSI, and ITU, have created working groups to address this standardization and complexity problem, generating the first architectural standards and models for 5G.
The ETSI Industry Specification Groups (ISGs) ENI (Experiential Networked Intelligence) and SAI (Securing Artificial Intelligence) are working in parallel with ITU-T's Q20/13 and the FG ML5G (Focus Group on Machine Learning for Future Networks including 5G), and with 3GPP TR 23.791, to incorporate ML into 5G and future networks, from the edge to the core network.
In this section, we discuss the most relevant standards, network and services architectures developed by the main standardization organizations and open forums, which enable the application of ML in mobile edge computing within the framework of 5G.

ETSI
ETSI is a European Standards Organization (ESO) that deals with electronic communications networks and services. It is a partner in the international Third Generation Partnership Project (3GPP) and has developed thousands of standards for mobile and Internet technology since 3G networks. With 3GPP and the guidance of specialists in the ISGs for NFV, mobile edge computing, and ENI, ETSI has created standards for developing automated and cognitive services based on real-time user needs, local environmental conditions, and business goals [92,93].
As researchers and technologists work to automate networks through DL, they must bear in mind the growing body of standards that will guide best practices for security, efficiency, and consumer experience in future networks. The main 5G management standard is ETSI's management and orchestration (MANO) architecture for NFV, which simplifies the roll-out of network services and reduces both deployment and operational costs. The three core functional blocks of MANO are: 1. the NFV Orchestrator, which controls network service onboarding, lifecycle management, and resource management, including capacity planning, migration, and fault management; 2. the VNF Manager, which configures and coordinates VNF instances; and 3. the Virtualized Infrastructure Manager (VIM), which controls and manages the physical and virtual infrastructure, i.e., computing, storage, and network resources.
Each functional block for MANO presents opportunities for meaningful DL implementations in highly virtualized 5G networks.
The mobile edge computing and NFV architectures proposed by ETSI are complementary. Mobile edge computing and VNF applications can be instantiated on the same virtual infrastructure; in fact, NFV treats mobile edge computing as a VNF. Mobile edge computing consists of various entities that can be grouped into the system level, the host level, and networks. It supports different network infrastructures, including, in particular, those proposed by 3GPP for 5G, and can be one of the cornerstones of 5G at the edge. The 5G system designed by 3GPP makes it easier to deploy user plane functions at the edge of the 5G network. The Network Exposure Function (NEF) in the control plane exposes the capabilities of network functions to external entities.
One of the features supported by mobile edge computing is 5GcoreConnect, which exchanges notifications with the 5G Network Exposure Function in the control plane or with other 5G core network functions. This feature allows mobile edge computing to receive or send traffic, change routing policies, or perform policy control. Mobile edge computing can use the shared information to instantiate applications, manage or select the mobile edge computing host, or perform various functionalities between both systems. ETSI ISG mobile edge computing is focused on the management domain, with use cases including cognitive assistance, optimization of QoE and resource use, and smart reallocation of instances, among others. The strict requirements of these use cases make a standardization framework essential for the application of ML in domain management.
To continue progress in connecting mobile edge computing and DL implementations for 5G, ETSI's ENI has developed models in which machine learning techniques can replace manual orchestration or traditional static policies in the MANO architecture [94]. The use cases identified by ENI align with current goals in 5G research and, if properly realized, can even address some of the open issues discussed in Section 4, such as security and energy usage. ENI functional blocks include knowledge representation and management, context-aware management, situation-aware management, policy-based management, and cognition management. Using these functional blocks, 5G networks can apply the fundamentals of zero-touch network and service management (ZSM) to become self-configuring, self-optimizing, and self-healing. Future mobile edge computing architectures will take advantage of the information supplied by the self-organizing networks (SONs) proposed in the 3GPP SA5 working group and of the 3GPP technical report "Study of Enablers for Network Automation for 5G (Release 16)" to study and specify how to collect data and how to feed data analytics back to the network functions [95,104].
An ETSI ISG produced a report for the organizations developing ZSM [96]. The report summarizes the main activities and architectures developed by standardization bodies, open-source organizations, and industry associations around ZSM. Figure 7 shows the ETSI MANO framework with potential data sources and solution points for DL enhancements following ENI guidelines for automated management.

ITU
The ITU-T Focus Group on Machine Learning for Future Networks including 5G (FG-ML5G) was established in 2017. This group drafts technical reports and specifications for ML applied to 5G and future networks, including standardization of interfaces, network architectures, protocols, algorithms and data formats [97].
The output of the FG-ML5G group includes ITU-T Y-series recommendations that provide an architectural framework for ML in future networks and use cases [98,99,100,101].

IETF
IETF produces open technical documentation to improve the design, use, and management of the Internet. In collaboration with the larger network engineering community, the IETF has produced an Internet draft specifying how ML could be introduced into distributed system pipelines through Network Telemetry and Analytics (NTA) and Network Artificial Intelligence (NAI). NTA uses telemetry and historical data to perform ML-based analytics for detection, prescription, and prediction in networks through closed-loop control via SDN [102].
NTA and NAI are part of the larger effort to add intelligent management to Internet systems. Both architectures are designed to perform real-time analytics for traffic engineering and monitoring alongside existing protocols, such as BGP. Key performance indicators, such as CPU performance, memory usage, and interface bandwidth, can be used to diagnose network health even in multi-layer and virtualized environments.
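As a baseline illustration of KPI-driven health diagnosis, the sketch below flags anomalies in a CPU-utilization series with a simple z-score test. The threshold and data are hypothetical, and the IETF draft does not prescribe this particular method; an NTA pipeline could run such checks alongside ML-based analytics.

```python
import statistics

# Hypothetical sketch: flag KPI samples that deviate from the series
# mean by more than `threshold` standard deviations (z-score test).

def anomalies(samples, threshold=2.5):
    """Return indices of samples whose z-score exceeds the threshold."""
    mean = statistics.mean(samples)
    sd = statistics.pstdev(samples)
    if sd == 0:
        return []                      # flat series: nothing to flag
    return [i for i, x in enumerate(samples) if abs(x - mean) / sd > threshold]

# Example: CPU utilisation (%) with one spike during a traffic surge.
cpu = [41, 43, 40, 42, 44, 41, 95, 42, 40, 43]
spikes = anomalies(cpu)
```

The same check applies unchanged to memory usage or interface bandwidth, and a production pipeline would typically replace the static threshold with a model learned from historical telemetry.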
IETF's Internet draft also describes intelligent service function chaining (SFC), a key task in 5G networks that allows network services to be automatically composed of several more basic network functions. Together with application analytics and intelligent SFC, network operators could experience enhanced performance and control across locale and connection points.

Automation
In 5G, machine learning is considered a useful tool for automating network operation and management, including the control and management of network slicing, service creation and orchestration, security, and mobility management. In [90], the authors discuss the applicability of ML to enable 5G slicing functions to execute autonomously. A framework is presented in [91] for the operation and control of network slices that continuously monitors performance, workload, and resource use and dynamically adjusts the resources allocated to the slices.
In [105], the authors present a system for the orchestration and control of E2E network slices based on service and resource modeling software that allows for customized business and software design. They assert that applying ML in large-scale systems will yield advantages such as better efficiency and faster integration in network management automation.

Implementations and Use Cases
This overview of the challenges and opportunities in intelligent mobile edge computing for 5G networks is timely because of the recent implementations of 5G networks in multiple countries supported by international telecommunication companies. 5G promises faster connection experiences and enhanced security through eMBB, URLLC, mMTC and design decisions for resource sharing such as network slicing and prioritized traffic. Several key world powers are competing for technology dominance in the space that will define the future of communication with new hardware, software, and data processing paradigms.

5G Implementations
Following advances by the United States, Europe, and South Korea, China pledged to roll out 130,000 new 5G base stations and relay stations by the end of 2019, spread over 50 major cities [106]. These new base stations can support massive data collection to benefit network science and efficient management to provide cost-effective and efficient systems for a growing number of users. China's efforts to become a leading center for 5G were enabled by the collaboration between the country's three largest telecommunication companies (China Mobile, China Unicom, and China Telecom). Some benefits of increasing 5G small-cell base station availability are improving overall network spectral efficiency [107] and real-time insights into network capacity and performance for increasingly automated and centralized network management. However, how these new local stations and potential data processing centers will be incorporated into the 5G mobile edge computing system has not been determined. Without a doubt, ML will allow the utility of these stations and their supporting technologies to scale and empower new industries and verticals for 5G.
In 2019, Dell Technologies and the telecom Orange began working together "to jointly explore developing key technology areas for distributed cloud architectures to deliver the real-time edge use cases and new services opportunities 5G will create" [108]. Within months, Microsoft and NVIDIA announced a collaboration to advance mobile edge computing AI computing capabilities for enterprises [109] and MobiledgeX and World Wide Technology became partners to accelerate the commercialization of scalable mobile edge computing deployments [110]. Many telecommunication companies are teaming up with leaders in ML and AI to focus on intelligent mobile edge computing because of the many new applications that will benefit.

New Mobile Edge Applications
Corporate investment in 5G has risen rapidly because the new use cases (and revenue streams) opened by the next-generation technology will reduce expenditures and maintain flat rates for users. In its report naming mobile edge computing a key enabling technology for 5G networks, the European 5GPPP identified multiple verticals that would be empowered by mobile edge computing: Internet of Things (IoT), caching, video streaming, augmented reality, healthcare, and connected vehicles. With the building of large-scale 5G networks, researchers are focusing on the myriad application spaces that could benefit from the low latency, proximity, high bandwidth, location awareness, and real-time insight provided by mobile edge computing [2]. The growth of mobile edge computing will disrupt the current cloud-computing paradigm in favor of localized computing near the user.
Below we discuss a few of the new application spaces for emerging ML-enabled mobile edge computing systems.

Internet-of-(Every)Thing
By 2022, the number of IoT devices is expected to reach 18 billion [111], each requiring network connectivity. Mobile edge computing can benefit everything from small-scale personal IoT devices to large-scale deployments, such as smart cities and new industrial applications. Small devices, such as in-home IoT tools (Amazon Alexa, Nest Cam, Google Home), can use mobile edge computing to offload computational tasks that are too complex for their small memory capacity [112,113]. Users streaming videos on their mobile devices can enjoy cached versions of their desired content from mobile edge computing base stations [114], or videos automatically delivered at a quality/bandwidth supportable by their network based on local conditions [115]. Growing augmented reality systems, such as Pokémon Go, can store locally relevant information for overlaying the user's environment in local mobile edge computing base stations, so that users experience reduced latency compared to information retrieval from regional cloud data centers [116].

Connected Vehicles
5G mobile edge computing can enable new applications on a larger scale. Consider the coordination of increasingly ad hoc networks of unmanned aerial vehicles (UAVs) and connected cars, which must navigate new surroundings [117], offload computational tasks, and download new information with the assistance of nearby mobile edge computing stations [118], all with low latency as the vehicles move throughout their region. UAVs have strict memory and power-consumption constraints under which ML decision tasks must function and could, therefore, benefit from distributed learning methods and computational offloading.

Smart Cities
In urban settings, 5G mobile edge computing can enhance smart city initiatives globally by providing points for computation and data storage relevant to local events and populations. By harnessing the power of cloud computing and Internet connectivity for large-scale IoT in cities, smart cities can provide urban services, such as electricity grids, transportation systems, and emergency response, through deep learning in mobile edge computing. Cities can use mobile edge computing to manage energy consumption in growing urban areas based on energy profiles for common activities and real-time demand [119]. In addition, placing DL at the mobile edge can promote public safety and policing efforts in large urban areas through lightweight computer vision systems [120,121].

Robotics and Industry
5G can be used to automate the work conducted in factories using robotic devices and real-time big data analysis at the mobile edge within the factory. Robust 5G networks provide technology advancements critical to factory automation, such as high availability, low latency, and resilience against attacks, as provided by a dedicated slice of the network [122]. 5G and mobile edge computing enable the automation of critical applications, such as quality inspection of products [123]. Automated factory systems can then leverage DL to manage the offloading of computational tasks to the local edge network in energy- and resource-efficient ways [122,124]. In the medical sphere, 5G paired with robotics can enable remote medical examinations or surgeries, with ultra-low-latency remote control and tactile feedback for remote surgeons.

Conclusions
Mobile edge computing plays a crucial role in helping 5G networks achieve the goals of eMBB, URLLC, and mMTC as demand for network resources steadily increases with the rise of IoT and video streaming devices. Deep learning, a powerful subset of machine learning, can be adapted for use in 5G network operations to predict user behavior and automate the management of dynamic network resources. Using deep learning can both improve user experience and lower operational costs for telecommunication companies in the future.
This document provides insight into four main ideas in ML for 5G mobile edge computing. First, in Section 2, we show that mobile edge computing is a prime candidate for implementing the new verticals, features, and service categories required to deploy 5G. Second, we provide an overview of the key deep learning concepts and how they can be adapted to work best in mobile edge computing environments with computing and memory limitations; Section 3 explores the deep learning techniques suitable for automating the operations required to manage services and applications over increasingly complex 5G networks. Third, Section 5 develops a taxonomy of the challenges and trade-offs posed by introducing a subset of deep learning techniques into mobile edge computing; 5G networks enable a diverse set of new applications and on-demand services with strict requirements, substantially increasing the complexity for which deep learning methods are uniquely suited. Finally, Section 6 presents proofs of concept and the most relevant developments combining mobile edge computing with ML in the 5G environment. We also discuss mobile edge computing applications and how newly designed implementation standards for deep learning in 5G networks can enhance various 5G verticals, including IoT, AR/VR, vehicular networks, and smart cities.
In conclusion, we hope that this survey provides useful information on the use and adaptation of deep learning to improve mobile edge computing, and that these techniques stimulate further research and deployment scenarios that allow increased automation of networks and services in the future.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: