arXiv Open Access 2011

Clustering and Classification in Text Collections Using Graph Modularity

Grigory Pivovarov, Sergei Trunov

Abstract

A new fast algorithm for clustering and classifying large collections of text documents is introduced. The algorithm operates on the bipartite graph that realizes the word-document matrix of the collection, and uses the modularity of this bipartite graph as the optimization functional. Experiments with the algorithm on a number of text collections showed competitive clustering (classification) quality and record-breaking speed.
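To make the optimization functional concrete, the sketch below scores a joint partition of words and documents with Barber's bipartite modularity, a commonly used bipartite form of the measure. The abstract does not say which modularity definition the authors use, so this choice is an assumption, as are the function name `bipartite_modularity`, the edge-list input format, and the toy data. It only evaluates a given partition; it is not the authors' algorithm and does not search for the best partition.

```python
from collections import defaultdict


def bipartite_modularity(edges, word_cluster, doc_cluster):
    """Barber-style bipartite modularity of a word/document partition.

    edges: iterable of (word, doc, weight) entries of the word-document matrix.
    word_cluster / doc_cluster: dicts mapping each word / document to a label.
    All names and the input format are illustrative assumptions.
    """
    edges = list(edges)
    total = float(sum(weight for _, _, weight in edges))
    if total == 0:
        return 0.0

    # Total weight attached to the words and to the documents of each cluster.
    word_mass = defaultdict(float)
    doc_mass = defaultdict(float)
    inside = 0.0
    for word, doc, weight in edges:
        word_mass[word_cluster[word]] += weight
        doc_mass[doc_cluster[doc]] += weight
        if word_cluster[word] == doc_cluster[doc]:
            inside += weight  # observed weight that stays within a cluster

    # Weight expected within clusters if edges were placed at random
    # while preserving every word and document degree.
    expected = sum(
        word_mass[label] * doc_mass.get(label, 0.0) for label in word_mass
    ) / total

    return (inside - expected) / total


# Toy collection: two documents with disjoint vocabularies.
edges = [("graph", "d1", 1), ("cluster", "d1", 1),
         ("quark", "d2", 1), ("gluon", "d2", 1)]

split_words = {"graph": 0, "cluster": 0, "quark": 1, "gluon": 1}
split_docs = {"d1": 0, "d2": 1}
q_split = bipartite_modularity(edges, split_words, split_docs)

single_words = {word: 0 for word in split_words}
single_docs = {doc: 0 for doc in split_docs}
q_single = bipartite_modularity(edges, single_words, single_docs)

print(q_split)   # 0.5
print(q_single)  # 0.0
```

On this toy data, separating the two topics scores 0.5 while lumping everything into one cluster scores 0.0, which is the kind of contrast a modularity-maximizing clustering would exploit.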

Topics & Keywords

cs.IR cs.DL

Authors (2)

Grigory Pivovarov

Sergei Trunov

Citation Format

Pivovarov, G., & Trunov, S. (2011). Clustering and Classification in Text Collections Using Graph Modularity. https://arxiv.org/abs/1105.5789

Journal Information

Year Published: 2011
Language: en
Source Database: arXiv
Access: Open Access ✓