Weighted coherence as topic models’ interpretability measure

17 May 2024, 18:15
12m
Phystech.Digital (Физтех.Цифра), Lecture Hall (MIPT)

141701, Russia, Dolgoprudny, Institutsky Pereulok, 9
Computer & Data Science 17 Computer & Data Science

Speaker

Kirill Zhgutov

Description

Topic modeling is a widely used tool for analyzing text data. It can be applied to large collections of text such as articles, reviews, and social media posts, where it helps to cluster documents by topic, extract keywords, and identify patterns in the data. Many automatically computed criteria exist for assessing the informativeness of topic models, and coherence is one of them. The problem with coherence is that its calculation ignores most of the text, which makes it an unreliable estimate of topic quality. The aim of this work is to propose a new method for calculating coherence that takes into account the distribution of the topic throughout the text.
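To illustrate the limitation described above, here is a minimal sketch of a conventional top-token coherence score: the average pairwise PMI of a topic's top tokens, estimated from document co-occurrence. This is not the method proposed in the talk. The function name `top_token_coherence`, the toy corpus, and the convention of scoring never-co-occurring pairs as 0 are assumptions made for illustration only.

```python
import math
from itertools import combinations


def top_token_coherence(top_tokens, documents):
    """Average pairwise PMI over a topic's top tokens.

    Probabilities are estimated as document frequencies. Pairs that
    never co-occur contribute 0.0 (an illustrative simplification).
    """
    n_docs = len(documents)
    doc_sets = [set(doc) for doc in documents]

    def doc_prob(*tokens):
        # Fraction of documents that contain all of the given tokens.
        hits = sum(1 for d in doc_sets if all(t in d for t in tokens))
        return hits / n_docs

    scores = []
    for w1, w2 in combinations(top_tokens, 2):
        p12 = doc_prob(w1, w2)
        if p12 == 0:
            scores.append(0.0)
        else:
            scores.append(math.log(p12 / (doc_prob(w1) * doc_prob(w2))))
    return sum(scores) / len(scores)


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["cat", "dog", "pet"],
    ["cat", "dog"],
    ["stock", "market"],
    ["stock", "market", "price"],
]

coherent = top_token_coherence(["cat", "dog"], docs)
incoherent = top_token_coherence(["cat", "stock"], docs)
```

Only the tokens passed in `top_tokens` influence the score. Every other word in the corpus is ignored, however strongly it is associated with the topic, which is the gap the abstract points to.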
