arXiv Open Access 2025

Connected k-Median with Disjoint and Non-disjoint Clusters

Jan Eube Kelin Luo Dorian Reineccius Heiko Röglin Melanie Schmidt
Lihat Sumber

Abstrak

The connected $k$-median problem is a constrained clustering problem that combines distance-based $k$-clustering with connectivity information. The problem allows to input a metric space and an unweighted undirected connectivity graph that is completely unrelated to the metric space. The goal is to compute $k$ centers and corresponding clusters such that each cluster forms a connected subgraph of $G$, and such that the $k$-median cost is minimized. The problem has applications in very different fields like geodesy (particularly districting), social network analysis (especially community detection), or bioinformatics. We study a version with overlapping clusters where points can be part of multiple clusters which is natural for the use case of community detection. This problem variant is $Ω(\log n)$-hard to approximate, and our main result is an $\mathcal{O}(k^2 \log n)$-approximation algorithm for the problem. We complement it with an $Ω(n^{1-ε})$-hardness result for the case of disjoint clusters without overlap with general connectivity graphs, as well as an exact algorithm in this setting if the connectivity graph is a tree.

Topik & Kata Kunci

Penulis (5)

J

Jan Eube

K

Kelin Luo

D

Dorian Reineccius

H

Heiko Röglin

M

Melanie Schmidt

Format Sitasi

Eube, J., Luo, K., Reineccius, D., Röglin, H., Schmidt, M. (2025). Connected k-Median with Disjoint and Non-disjoint Clusters. https://arxiv.org/abs/2507.02774

Akses Cepat

Lihat di Sumber
Informasi Jurnal
Tahun Terbit
2025
Bahasa
en
Sumber Database
arXiv
Akses
Open Access ✓