Text analysis

We develop learning tools in the field of natural language processing, focusing predominantly on texts about goods, services and news.

For example, to analyse a review, we use our own POS analysers and stop-word datasets. Documents are converted into vectors using tf-idf and then grouped by a clustering method (k-means) into clusters that share the same topic. We identify the clusters with a high degree of internal coherence. The identified topics correspond to the main parameters of the product segment being researched. These segment-wide clusters, built from professional articles, are then used to classify reviews of individual products. From each identified cluster, we select the texts with the highest information value, that is, those closest to the centroid of that cluster, and present them as a suitable representation of the given set of reviews.
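
The pipeline described above can be illustrated with a minimal sketch in plain Python. It is not our production code: the stop-word list, the sample reviews, the smoothed idf formula, the farthest-first initialisation of k-means and all function names are assumptions made for illustration, and the POS analysis and the check of cluster coherence are left out.

```python
import math
from collections import Counter

# Hypothetical, tiny stop-word list; a real one would be far larger.
STOP_WORDS = {"the", "a", "is", "and", "it", "very", "this"}

def tokenize(text):
    """Lowercase, split on whitespace, drop stop words and non-alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS]

def tfidf_vectors(docs):
    """Return one L2-normalised tf-idf vector per document (smoothed idf, an assumption)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, max_iter=50):
    """Plain k-means with deterministic farthest-first initialisation."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def representatives(docs, vectors, labels, centroids):
    """For each cluster, pick the document closest to the cluster centroid."""
    reps = {}
    for j, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            best = min(members, key=lambda i: sq_dist(vectors[i], centroid))
            reps[j] = docs[best]
    return reps

# Made-up sample reviews covering two product parameters.
docs = [
    "battery lasts long battery life great",
    "battery life great",
    "battery drains fast",
    "screen bright sharp display",
    "display sharp screen colours",
    "screen cracked display",
]
vectors = tfidf_vectors(docs)
labels, centroids = kmeans(vectors, k=2)
reps = representatives(docs, vectors, labels, centroids)
for cluster, text in sorted(reps.items()):
    print(cluster, text)
```

In this toy data the two topics share no vocabulary, so the clusters separate cleanly; real reviews overlap much more, which is why a coherence check on the resulting clusters matters before they are used for classification.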