Time series data augmentation
Cite as: hdl:2117/361029
Document type: Bachelor's thesis
Date: 2021-07-13
Access conditions: Open access
Except where otherwise noted, the contents of this work are licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Abstract
Data augmentation is a powerful tool for increasing the size of a training set and thereby improving the learning process of neural networks. It is well studied for images but less developed for other types of data, such as time series. The main objective of this work is therefore the augmentation of time series data sets, which is divided into two sub-objectives: analyzing the time series data and its variables, and studying and building generative models. To this end, a specific data set is used, processed with Python and the Pandas, NumPy, and TensorFlow libraries. First, the time series data set is studied and as much information as possible is extracted so that it can be used in machine learning models. The data are divided into three groups in order to study how model training behaves as a function of the group. Next, three generative neural models are built and each is trained on the three data groups, for a total of nine training runs; after training, the best-performing models are chosen to generate new synthetic data. Finally, an autoencoder model is used to verify that the synthetic data have the same properties as the original data, and four training runs are carried out to compare results between the original data, the generated data, and a combination of both. Three combinations of model and data group were selected to generate the data, and the synthetic data from all three greatly improved the training of the autoencoders that used them.
This work concludes that the objectives set at the beginning have been achieved. Augmenting the training data for time series, which was the ultimate goal of the work, gave very good results, with improvements of ninety percent.
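The verification step described in the abstract, checking with an autoencoder whether synthetic data share the properties of the original data, can be illustrated with a small sketch. This is not the thesis code: the noisy sine wave, the jitter-based "generator", the window width, and the latent size are all assumptions made for illustration, and a closed-form linear autoencoder (computed with NumPy's SVD, equivalent to PCA) stands in for the TensorFlow autoencoder used in the work.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_windows(series, width):
    """Slice a 1-D series into overlapping windows of a fixed width."""
    return np.stack([series[i:i + width] for i in range(len(series) - width + 1)])

def fit_linear_autoencoder(windows, latent_dim):
    """Closed-form linear autoencoder: the top principal directions (via SVD)."""
    mean = windows.mean(axis=0)
    _, _, vt = np.linalg.svd(windows - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def reconstruction_mse(windows, model):
    """Encode, decode, and return the mean squared reconstruction error."""
    mean, components = model
    centered = windows - mean
    reconstructed = centered @ components.T @ components
    return float(np.mean((centered - reconstructed) ** 2))

# Toy "original" data: a noisy sine wave (assumption, not the thesis data set).
t = np.arange(1000)
series = np.sin(2 * np.pi * t / 50) + 0.05 * rng.standard_normal(t.size)
windows = make_windows(series, width=32)
# Leave a gap of one window width so the two splits share no samples.
train, held_out = windows[:700], windows[732:]

# Stand-in "generator": jittered copies of training windows (assumption).
synthetic = train + 0.05 * rng.standard_normal(train.shape)
# Control: pure noise that does not share the structure of the original data.
noise = rng.standard_normal(held_out.shape)

model = fit_linear_autoencoder(train, latent_dim=4)
mse_original = reconstruction_mse(held_out, model)
mse_synthetic = reconstruction_mse(synthetic, model)
mse_noise = reconstruction_mse(noise, model)
print(mse_original, mse_synthetic, mse_noise)
```

If the synthetic windows resemble the original ones, an autoencoder fitted on original data should reconstruct them far better than it reconstructs unrelated noise. The comparison described in the abstract goes further by also training autoencoders on generated and combined data, which this sketch does not attempt.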
Subjects: Neural networks (Computer science), Machine learning
Degree: Bachelor's Degree in Industrial Technologies Engineering (2010 curriculum)
Files | Description | Size | Format |
---|---|---|---|
time-series-data-augmentation.pdf | | 2.694 MB | |
anexo-1-codigo.zip | | 1.639 MB | application/zip |