SEMANTIC ANALYSIS OF THE KAZAKH LANGUAGE BASED ON THE APPROACH OF NEURAL NETWORKS

Authors

  • D. Rakhimova, Institute of Information and Computational Technologies, Almaty, Kazakhstan
  • A. Turganbayeva, Institute of Information and Computational Technologies, Almaty, Kazakhstan

Keywords:

word2vec, model, vector, word, representation, semantic, analysis, Kazakh, language.

Abstract

This paper reviews current methods and software approaches for semantic analysis. The review shows that
machine learning is the most widely used approach to the semantic analysis of text resources. The paper presents
an algorithm developed for the semantic analysis of Kazakh-language text, together with a software implementation
of this approach in the Python programming language. Vector representations of words were obtained by machine
learning on a corpus of 1 million sentences in the Kazakh language. The implementation uses well-known libraries
such as gensim, matplotlib, sklearn, and numpy. Based on a set of semantically related pairs of words, an ontology
for a specific document is built during the operation of a neural network. The results of the experiments are
presented graphically as sets of words. The novelty of the proposed approach lies in identifying semantically
close words in Kazakh-language texts. This work contributes to solving problems in machine translation systems,
information retrieval, and text analysis and processing systems for the Kazakh language.
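
The idea of identifying semantically close words from vector representations can be illustrated with a minimal sketch. It is not the authors' implementation: in practice the vectors would come from a word2vec model trained on the corpus (for example with gensim), whereas the three-dimensional vectors, the 0.8 threshold, and the function names `cosine` and `related_pairs` below are assumptions made up for illustration only.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def related_pairs(vectors, threshold=0.8):
    """Return (word1, word2, similarity) for every pair at or above the threshold."""
    pairs = []
    for w1, w2 in combinations(sorted(vectors), 2):
        sim = cosine(vectors[w1], vectors[w2])
        if sim >= threshold:
            pairs.append((w1, w2, round(sim, 3)))
    # Most similar pairs first
    return sorted(pairs, key=lambda p: -p[2])


# Toy vectors invented for this example (NOT taken from a trained model).
toy_vectors = {
    "кітап": [0.9, 0.1, 0.0],  # "book"
    "оқу": [0.8, 0.2, 0.1],    # "reading, study"
    "өзен": [0.0, 0.1, 0.9],   # "river"
    "су": [0.1, 0.0, 0.8],     # "water"
}

pairs = related_pairs(toy_vectors)
for w1, w2, sim in pairs:
    print(w1, w2, sim)
```

Pairs that pass the threshold could then serve as edges between word nodes when assembling a document-level ontology; the threshold value would need tuning on real data.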

Published

2020-09-22

How to Cite

Rakhimova, D., & Turganbayeva, A. (2020). SEMANTIC ANALYSIS OF THE KAZAKH LANGUAGE BASED ON THE APPROACH OF NEURAL NETWORKS. Известия НАН РК. Серия физико-математическая, (5), 68–75. Retrieved from https://journals.nauka-nanrk.kz/physics-mathematics/article/view/624