arXiv
Open Access
2016
Hierarchical Latent Word Clustering
Halid Ziya Yerebakan
Fitsum Reda
Yiqiang Zhan
Yoshihisa Shinagawa
Abstract
This paper presents a new Bayesian non-parametric model that extends the use of Hierarchical Dirichlet Allocation to extract tree-structured word clusters from text data. The model's inference algorithm groups words into a cluster if they share a similar distribution over documents. In our experiments, we observed meaningful hierarchical structures on the NIPS corpus and on radiology reports collected from public repositories.
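To make the idea of words "sharing a similar distribution over documents" concrete, here is a minimal sketch. It is not the paper's Bayesian non-parametric inference: it is a toy, flat (non-hierarchical) heuristic that assumes Jensen-Shannon divergence as the similarity measure and a greedy threshold rule for grouping. The function names, the threshold value, and the four example documents are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def word_doc_distributions(docs):
    """Map each word to its normalized distribution over documents."""
    counts = {}
    for d, text in enumerate(docs):
        for word, n in Counter(text.lower().split()).items():
            counts.setdefault(word, [0.0] * len(docs))[d] += n
    return {w: [c / sum(row) for c in row] for w, row in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, which lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def greedy_clusters(dists, threshold=0.5):
    """Assign each word to the first cluster whose first member is close enough.

    Illustrative heuristic only (an assumption of this sketch), not the
    inference algorithm described in the abstract.
    """
    clusters = []
    for word in sorted(dists):
        for cluster in clusters:
            if js_divergence(dists[word], dists[cluster[0]]) <= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


# Hypothetical toy corpus: two "machine learning" and two "radiology" documents.
docs = [
    "neural network training neural network",
    "lung nodule scan lung nodule",
    "network training",
    "scan nodule",
]
dists = word_doc_distributions(docs)
clusters = greedy_clusters(dists)
print(clusters)
```

In this toy corpus, words that occur in the same documents end up together, while words with disjoint document support have a divergence of 1 and are kept apart. A tree-structured result, as described in the abstract, would require a hierarchical model rather than this single-level grouping.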
Authors (4)
Halid Ziya Yerebakan
Fitsum Reda
Yiqiang Zhan
Yoshihisa Shinagawa
Journal Information
- Year Published: 2016
- Language: en
- Source Database: arXiv
- Access: Open Access