Analyzing distances in word embeddings and their relation with seme analysis
Document type: Conference report
Rights access: Open Access
Word embeddings have recently become a fundamental tool of Natural Language Processing, with applications to tasks such as machine translation and image annotation. The high-dimensional space defined by these embeddings is typically explored and exploited through distance-based operations. In this paper we address the problem of finding related words in a text embedding. Such relationships can be of different kinds; we focus on semantic relations such as synonymy and antonymy. We explore the idea of using the distance between norms instead of, as other authors have done before, the vector that joins them. We present different norms, some of them well known in the literature and others not so widely used, and we also introduce a new one together with its theoretical mathematical framework. We also explain why they do or do not work properly, and compare their performance on the two most widely used embeddings, GloVe and Word2Vec.
Citation: Gijón, M.; Vilalta, A.; Garcia-Gasulla, D. Analyzing distances in word embeddings and their relation with seme analysis. In: International Conference of the Catalan Association for Artificial Intelligence. "Proceedings of the 22nd International Conference of the Catalan Association for Artificial Intelligence". IOS Press, 2019, p. 407-416.