dc.contributor.author | de Arriba Serra, Ariadna |
dc.contributor.author | Oriol Hilari, Marc |
dc.contributor.author | Franch Gutiérrez, Javier |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació |
dc.date.accessioned | 2022-03-03T10:12:24Z |
dc.date.available | 2022-03-03T10:12:24Z |
dc.date.issued | 2021 |
dc.identifier.citation | De Arriba, A.; Oriol, M.; Franch, X. Merging datasets for emotion analysis. In: International Workshop on Software Engineering Automation: A Natural Language Perspective. "2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops: 15-19 November 2021, online event: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 227-231. ISBN 978-1-6654-3583-3. DOI 10.1109/ASEW52652.2021.00051. |
dc.identifier.isbn | 978-1-6654-3583-3 |
dc.identifier.uri | http://hdl.handle.net/2117/363351 |
dc.description.abstract | Context. Applying sentiment analysis is generally a laborious task. Obtaining a good-quality dataset with a balanced distribution and enough samples makes the job even more complicated.
Objective. We want to find out whether merging compatible datasets improves emotion analysis based on machine learning (ML) techniques, compared to the original, individual datasets.
Method. We obtained two datasets of Covid-19-related tweets written in Spanish, and then built from them two new datasets that combine the original ones with different degrees of balance. We analyzed the results according to precision, recall, F1-score and accuracy.
Results. The results show that merging two datasets can improve the performance of ML models, particularly the F1-score, when the merging process follows a strategy that optimizes the balance of the resulting dataset.
Conclusions. Merging two datasets can improve the performance of ML models for emotion analysis, whilst saving resources for labeling training data. This might be especially useful for several software engineering activities that rely on ML-based emotion analysis techniques. |
dc.description.sponsorship | This paper has been funded by the Spanish Ministerio de Ciencia e Innovación under project PID2020-117191RB. |
dc.format.extent | 5 p. |
dc.language.iso | eng |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) |
dc.subject | UPC subject areas::Computer science::Artificial intelligence::Machine learning |
dc.subject | UPC subject areas::Computer science::Software engineering |
dc.subject.lcsh | Machine learning |
dc.subject.lcsh | Online social networks |
dc.subject.lcsh | Sentiment analysis |
dc.subject.other | Emotion classification |
dc.subject.other | Merging datasets |
dc.subject.other | Social media |
dc.subject.other | Twitter |
dc.subject.other | BETO |
dc.title | Merging datasets for emotion analysis |
dc.type | Conference report |
dc.subject.lemac | Machine learning |
dc.subject.lemac | Online social networks |
dc.subject.lemac | Emotions |
dc.contributor.group | Universitat Politècnica de Catalunya. inSSIDE - integrated Software, Service, Information and Data Engineering |
dc.identifier.doi | 10.1109/ASEW52652.2021.00051 |
dc.description.peerreviewed | Peer Reviewed |
dc.relation.publisherversion | https://ieeexplore.ieee.org/document/9680305 |
dc.rights.access | Open Access |
local.identifier.drac | 32824318 |
dc.description.version | Postprint (author's final draft) |
dc.relation.projectid | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-117191RB-I00/ES/DESARROLLO, OPERATIVA Y GOBERNANZA DE DATOS PARA SISTEMAS SOFTWARE BASADOS EN APRENDIZAJE AUTOMATICO/ |
local.citation.author | de Arriba, A.; Oriol, M.; Franch, X. |
local.citation.contributor | International Workshop on Software Engineering Automation: A Natural Language Perspective |
local.citation.publicationName | 2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops: 15-19 November 2021, online event: proceedings |
local.citation.startingPage | 227 |
local.citation.endingPage | 231 |