Using the LSA Algorithm for Clustering Geometry Problems

Жежерун, Олександр; Борозенний, Сергій; Ніверовський, Микита

dc.contributor.author	Жежерун, Олександр
dc.contributor.author	Борозенний, Сергій
dc.contributor.author	Ніверовський, Микита
dc.date.accessioned	2021-01-08T22:02:18Z
dc.date.available	2021-01-08T22:02:18Z
dc.date.issued	2020
dc.description.abstract	The paper examines the LSA (latent semantic analysis) method, in particular its most common variant, which is based on the singular value decomposition (SVD) of a matrix. On this basis, an algorithm for clustering problems is implemented and applied to the clustering of geometry problems.	uk_UA
dc.description.abstract	A large number of clustering algorithms exist today. The basic idea behind most of them is to group similar sequences into one class, or cluster. As a rule, the choice of algorithm is determined by the task. For textual data, the components being compared are sequences of words and their attributes (for example, the weight of a word in the text, the type of a named entity, sentiment, etc.). The texts are therefore first transformed into vectors, on which the subsequent operations are performed. This typically raises several problems: selecting the initial clusters, the dependence of clustering quality on text length, determining the total number of clusters, and so on. The most difficult problem, however, is the lack of connection between similar texts that use different vocabulary. In such cases, texts should be grouped not only on the basis of similarity but also on the basis of semantic contiguity, or associativity. One method that addresses such problems is latent semantic analysis (LSA). LSA is an information-processing method that analyzes a set of documents, finds the terms that occur in them, and on this basis identifies the characteristic factors (topics) that describe the content of each document. Three types of correlation are defined: "word-word", "word-paragraph", and "paragraph-paragraph". These are the three kinds of comparison a person makes when relating parts of a text to its content. LSA takes into account not only the frequency with which terms are used in the text but also latent (deep) connections. The first article on automatic document classification [4] was published in the Journal of the ACM in early 1963 and was the first to describe factor analysis as a means of finding information. Factor analysis is a method that determines the relationships between the values of variables.
In this paper, the possibility of using latent semantic analysis to cluster texts (geometry problems) is investigated, and an algorithm and the necessary software have been developed for this purpose.	en_US
dc.identifier.citation	Жежерун О. П. Використання алгоритму LSA для кластеризації задач із геометрії / Жежерун О. П., Борозенний С. О., Ніверовський М. М. // Наукові записки НаУКМА. Комп'ютерні науки. - 2020. - Т. 3. - С. 107-113.	uk_UA
dc.identifier.issn	2617-3808
dc.identifier.uri	https://doi.org/10.18523/2617-3808.2020.3.107-113
dc.identifier.uri	https://ekmair.ukma.edu.ua/handle/123456789/19174
dc.language.iso	uk	uk_UA
dc.relation.source	Наукові записки НаУКМА. Комп'ютерні науки.	uk_UA
dc.status	first published	uk_UA
dc.subject	LSA	uk_UA
dc.subject	LSI	uk_UA
dc.subject	SVD	uk_UA
dc.subject	кластеризація	uk_UA
dc.subject	стаття	uk_UA
dc.subject	LSA	en_US
dc.subject	LSI	en_US
dc.subject	SVD	en_US
dc.subject	clustering	en_US
dc.title	Використання алгоритму LSA для кластеризації задач із геометрії	uk_UA
dc.title.alternative	Using the LSA Algorithm for Clustering Geometry Problems	en_US
dc.type	Article	uk_UA
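
The abstract describes LSA based on the singular value decomposition and its use for clustering texts that share little vocabulary. The following is a minimal sketch of that idea, not the authors' implementation: the toy documents, the stopword list, the rank `k = 2`, and the threshold-based grouping in `threshold_clusters` are all assumptions made for illustration, and the record does not state which clustering procedure the paper uses.

```python
import numpy as np

# Hypothetical stopword list for the toy corpus below.
STOPWORDS = {"of", "a", "and"}


def term_document_matrix(docs):
    """Build a raw-count term-document matrix (rows: terms, columns: documents)."""
    tokenized = [[w for w in d.lower().split() if w not in STOPWORDS] for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            A[index[w], j] += 1.0
    return A, vocab


def lsa_document_vectors(A, k):
    """Return k-dimensional document vectors from the truncated SVD A ~ U_k S_k V_k^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (Vt[:k, :] * s[:k, None]).T  # shape: (n_docs, k)


def cosine(u, v):
    """Cosine similarity; defined as 0.0 if either vector is zero."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def threshold_clusters(vectors, threshold=0.5):
    """Greedy grouping (an assumption, not the paper's method): a document joins the
    first cluster whose representative is similar enough, else it starts a new one."""
    representatives, labels = [], []
    for v in vectors:
        for label, rep in enumerate(representatives):
            if cosine(v, rep) >= threshold:
                labels.append(label)
                break
        else:
            representatives.append(v)
            labels.append(len(representatives) - 1)
    return labels


# Toy geometry-flavoured documents (hypothetical data).
docs = [
    "area of a triangle",
    "hypotenuse of a triangle",
    "hypotenuse and legs",
    "radius of a circle",
    "radius and diameter",
]

A, vocab = term_document_matrix(docs)
vecs = lsa_document_vectors(A, k=2)
labels = threshold_clusters(vecs)

print("raw cosine(doc 0, doc 2):   ", cosine(A[:, 0], A[:, 2]))
print("latent cosine(doc 0, doc 2):", cosine(vecs[0], vecs[2]))
print("cluster labels:", labels)
```

In this toy corpus the first and third documents share no words, so their raw similarity is zero, yet both are linked to the second document. In the rank-2 latent space they end up pointing in the same direction, so the first three documents are grouped together and the last two form a second group. A realistic pipeline would likely add term weighting (for example TF-IDF) and a standard clustering algorithm.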
