Merging datasets for emotion analysis
Cite as: hdl:2117/363351
Document type: Conference report
Defense date: 2021
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder.
Abstract
Context. Applying sentiment analysis is generally a laborious task, and obtaining a good-quality dataset with a balanced distribution and enough samples makes it harder still.
Objective. We want to find out whether merging compatible datasets improves emotion analysis based on machine learning (ML) techniques, compared to the original, individual datasets.
Method. We obtained two datasets of Covid-19-related tweets written in Spanish and built two new datasets from them, each combining the originals with a different approach to balancing. We analyzed the results according to precision, recall, F1-score and accuracy.
Results. Merging two datasets can improve the performance of ML models, particularly the F1-score, when the merging process follows a strategy that optimizes the balance of the resulting dataset.
Conclusions. Merging two datasets can improve the performance of ML models for emotion analysis, whilst saving resources for labeling training data. This might be especially useful for several software engineering activities that leverage ML-based emotion analysis techniques.
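To make the idea of a balance-aware merge concrete, the following is a minimal sketch and not the authors' procedure, which this record does not describe. It assumes each dataset is a list of (text, label) pairs, concatenates the two, and downsamples every emotion class to the size of the rarest one. The function names `merge_balanced` and `macro_f1`, the toy tweets and the emotion labels are all hypothetical; the macro-averaged F1 helper only illustrates one of the metrics named in the abstract.

```python
import random
from collections import Counter, defaultdict


def merge_balanced(dataset_a, dataset_b, seed=0):
    """Concatenate two labeled datasets, then downsample each class
    to the size of the rarest class (one assumed balancing strategy)."""
    by_label = defaultdict(list)
    for text, label in dataset_a + dataset_b:
        by_label[label].append((text, label))
    target = min(len(rows) for rows in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    merged = []
    for label in sorted(by_label):
        merged.extend(rng.sample(by_label[label], target))
    rng.shuffle(merged)
    return merged


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data standing in for the two tweet datasets.
dataset_a = [("t1", "joy"), ("t2", "joy"), ("t3", "joy"), ("t4", "fear")]
dataset_b = [("t5", "fear"), ("t6", "fear"), ("t7", "sadness"), ("t8", "sadness")]

merged = merge_balanced(dataset_a, dataset_b)
class_counts = Counter(label for _, label in merged)
print(class_counts)
```

Downsampling is only one option; oversampling the minority classes or merging selectively by class would be equally plausible readings of "optimizing the balance", and each trades dataset size against label distribution differently.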
Citation: De Arriba, A.; Oriol, M.; Franch, X. Merging datasets for emotion analysis. A: International Workshop on Software Engineering Automation: A Natural Language Perspective. "2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops: 15-19 November 2021, online event: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 227-231. ISBN 978-1-6654-3583-3. DOI 10.1109/ASEW52652.2021.00051.
ISBN: 978-1-6654-3583-3
Publisher version: https://ieeexplore.ieee.org/document/9680305