Word Embedding for Semantically Related Words: An Experimental Study

Karyaeva, M. S.; Braslavski, P. I.; Sokolov, V. A.

doi:10.3103/S0146411619070083

Published: 04 March 2020

Volume 53, pages 638–643 (2019)
Automatic Control and Computer Sciences

Abstract—

The ability to identify semantic relations between words has made the word2vec model widely used in NLP tasks. Word2vec rests on a simple idea: two words are more similar if they occur in similar contexts. Each word is represented as a vector, so words whose vectors lie close together can be interpreted as similar. This makes it possible to extract semantic relations (synonymy, hypernymy, hyponymy, and others) automatically. Extracting semantic relations by hand is time-consuming and prone to bias, and it requires the help of experts. Unfortunately, the associative list of words that word2vec returns does not consist of semantically related words only. In this paper, we present some additional criteria that may help to solve this problem. Observations and experiments with well-known characteristics, such as word frequency and position in the associative list, may help improve the extraction of semantic relations for the Russian language using word embeddings. In the experiments, we use a word2vec model trained on the Flibusta corpus, with word pairs from Wiktionary serving as examples of semantic relationships. Semantically related words are useful for thesauri, ontologies, and intelligent systems for natural language processing.
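The idea of an associative list filtered by extra criteria can be illustrated with a minimal sketch. It is not the authors' procedure: the vectors, the frequencies, the English toy vocabulary, the function names, and the thresholds `max_rank` and `min_freq` are all hypothetical, and the abstract does not say in which direction frequency or list position should be used. The sketch only shows how neighbours ranked by cosine similarity might be pruned by position and frequency.

```python
import math

# Toy 3-dimensional vectors (made-up values). Real word2vec vectors are
# learned from a corpus and usually have hundreds of dimensions.
VECTORS = {
    "car":     [0.90, 0.10, 0.00],
    "carr":    [0.88, 0.12, 0.01],  # hypothetical rare misspelling
    "auto":    [0.85, 0.15, 0.05],
    "vehicle": [0.70, 0.30, 0.10],
    "road":    [0.50, 0.50, 0.20],
    "banana":  [0.00, 0.20, 0.90],
}

# Made-up corpus frequencies for the same words.
FREQ = {"car": 1000, "carr": 3, "auto": 400,
        "vehicle": 300, "road": 800, "banana": 120}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def associative_list(word, vectors):
    """Rank every other word by cosine similarity to `word`, best first."""
    target = vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def filter_candidates(word, vectors, freq, max_rank, min_freq):
    """Keep neighbours in the top `max_rank` positions whose corpus
    frequency is at least `min_freq` (both thresholds are assumptions)."""
    ranked = associative_list(word, vectors)
    return [w for rank, (w, _) in enumerate(ranked, start=1)
            if rank <= max_rank and freq[w] >= min_freq]


if __name__ == "__main__":
    for neighbour, score in associative_list("car", VECTORS):
        print(f"{neighbour:8s} {score:.4f}")
    print(filter_candidates("car", VECTORS, FREQ, max_rank=3, min_freq=50))
```

In this toy setting the nearest neighbour of "car" is the rare misspelling, which the frequency threshold removes, while "road" is associated with "car" but falls outside the top three positions. Whether such thresholds help on real Russian data is exactly the kind of question the experiments address.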

Funding

This study was funded by the Russian Foundation for Basic Research, research projects nos. 16-07-01180 and 16-06-00497.

Author information

Authors and Affiliations

Demidov Yaroslavl State University, 150003, Yaroslavl, Russia
M. S. Karyaeva & V. A. Sokolov
Ural Federal University, 620002, Yekaterinburg, Russia
P. I. Braslavski

Corresponding authors

Correspondence to M. S. Karyaeva, P. I. Braslavski or V. A. Sokolov.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Translated by A. Ovchinnikova

About this article

Cite this article

Karyaeva, M.S., Braslavski, P.I. & Sokolov, V.A. Word Embedding for Semantically Related Words: An Experimental Study. Aut. Control Comp. Sci. 53, 638–643 (2019). https://doi.org/10.3103/S0146411619070083

Received: 01 September 2018
Revised: 20 November 2018
Accepted: 25 November 2018
Published: 04 March 2020
Issue Date: December 2019
DOI: https://doi.org/10.3103/S0146411619070083
