Machine Learning Applied to Network Traffic
Document type: Master thesis
Rights access: Restricted access - author's decision
The application of machine learning to TCP/IP traffic flows is not new. However, this project aims to use it to predict the congestion avoidance algorithm within the first second of a data transfer. Being able to recognize the congestion avoidance strategy in use would improve flow control, making it possible to act proactively instead of reactively. For this project, the flows are generated using the NS-3 simulator. It provides a framework that can simulate the behaviour of data transfers over the internet and export the resulting traffic as pcap files. Wireshark has been used to extract from these files the information needed to build time series data and compute statistics on them. With this information available, several machine learning methods are proposed to see whether they can, using different sets of information representing the performance of the flows, distinguish between 8 different congestion avoidance algorithms: TCP BIC, TCP Highspeed, H-TCP, TCP Illinois, TCP Vegas, TCP Veno, TCP Westwood and TCP YeAH. None of the attempts achieves a test error significantly lower than 50%, because some algorithms perform too similarly to be distinguished (especially H-TCP and TCP Veno). The best results were achieved with random forests (using all the collected statistics as input) and with RNN-LSTM (when the inputs are time series of percent change values).
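As an illustration of the percent change inputs mentioned above, the following is a minimal sketch of how such a series could be derived from raw per-flow samples. The function name `percent_change`, the sample values, and the handling of zero-valued samples are assumptions made for this example; the abstract does not state which quantities were sampled or how the transformation was implemented.

```python
def percent_change(series):
    """Return the percent change between consecutive samples.

    The result has one element fewer than the input series.
    """
    changes = []
    for prev, curr in zip(series, series[1:]):
        if prev == 0:
            # Assumption: a change from zero is undefined, so report 0.0
            # rather than dividing by zero.
            changes.append(0.0)
        else:
            changes.append((curr - prev) / prev * 100.0)
    return changes


# Hypothetical samples of some per-flow quantity taken at fixed intervals.
samples = [100, 150, 75]
changes = percent_change(samples)
```

Working with relative changes rather than absolute values makes flows with different magnitudes more comparable, which is one plausible reason for feeding such series to a sequence model.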
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.