### Topic modelling

A **text collection** is a set of documents, each of which is a multiset of terms. **Terms** are elements of the **vocabulary**, a finite set W.

A **probabilistic topic model** is a mathematical model that describes each document and each term
as a discrete probability distribution over a set of topics T.

To construct such distributions, one approximates the matrix F of word frequencies in documents by the product
of two matrices, Φ and Θ. Each column of Φ is then the distribution of words in one topic, and each column of Θ
is the distribution of topics in one document.
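The shapes involved can be illustrated with a small NumPy sketch. It assumes the common convention in which Φ has size |W|×|T|, Θ has size |T|×|D|, and F holds frequencies normalized per document; the toy sizes and variable names are chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, n_topics, n_docs = 6, 2, 4  # toy sizes (assumption, for illustration only)

# Phi: one column per topic; each column is a distribution over words, p(w | t).
phi = rng.random((n_words, n_topics))
phi /= phi.sum(axis=0, keepdims=True)

# Theta: one column per document; each column is a distribution over topics, p(t | d).
theta = rng.random((n_topics, n_docs))
theta /= theta.sum(axis=0, keepdims=True)

# The product gives p(w | d) = sum_t p(w | t) * p(t | d) for every word and document.
p_wd = phi @ theta

# Each column of the product is again a distribution over words.
print(p_wd.shape)
print(p_wd.sum(axis=0))
```

Because both factors have columns that sum to one, every column of the product also sums to one, which is what allows it to model normalized word frequencies in a document.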

One method for such a decomposition is ARTM (Additive Regularization of Topic Models). It is implemented in the open-source BigARTM library.
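To give a feel for how such a decomposition can be computed, here is a minimal NumPy sketch of plain PLSA fitted with the EM algorithm, which is the special case of ARTM with no regularizers. It is not the BigARTM API and not how BigARTM is implemented; the function name `plsa`, its parameters, and the toy count matrix are all hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Approximate counts (words x documents) by Phi @ Theta using plain EM.

    Hypothetical helper for illustration; unregularized PLSA, not full ARTM.
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = counts.shape

    # Random column-stochastic initialization of both factors.
    phi = rng.random((n_words, n_topics))
    phi /= phi.sum(axis=0, keepdims=True)
    theta = rng.random((n_topics, n_docs))
    theta /= theta.sum(axis=0, keepdims=True)

    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step (folded in): ratio of observed counts to the model's p(w | d).
        ratio = counts / np.maximum(phi @ theta, eps)

        # Expected counts of word-topic and topic-document assignments.
        n_wt = phi * (ratio @ theta.T)
        n_td = theta * (phi.T @ ratio)

        # M-step: renormalize the expected counts column by column.
        phi = n_wt / np.maximum(n_wt.sum(axis=0, keepdims=True), eps)
        theta = n_td / np.maximum(n_td.sum(axis=0, keepdims=True), eps)
    return phi, theta

# Toy word-by-document count matrix (made up for this example).
counts = np.array([[4, 3, 0, 0],
                   [5, 2, 0, 1],
                   [0, 0, 6, 3],
                   [0, 1, 4, 5]], dtype=float)
phi, theta = plsa(counts, n_topics=2)
print(phi.round(2))
print(theta.round(2))
```

ARTM extends this scheme by adding a weighted sum of regularizers to the likelihood being maximized, which changes the M-step; the sketch above omits that part.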

### VisARTM

VisARTM is an interface for BigARTM. Its main purpose is visualizing topic models. It is aimed at two groups
of users: researchers who build topic models themselves and want to visualize them for research purposes, and
users who want to construct topic models without programming.

Main features of this service:

- Uploading and preprocessing text collections.
- Uploading prepared topic models.
- Automatic building of topic models with BigARTM.
- Automatic topic naming.
- Automatic topic arranging (the so-called topic spectrum).
- Visualization of topic models:
  - Visualization of the topic distribution in a document.
  - Visualization of a topic as ranked lists of words and documents.
  - Temporal visualizations.
  - Hierarchical visualizations.

- Search
- Assessment framework
- Automated research framework
- Conversion tools