arXiv Open Access 2025

Bisecting K-Means in RAG for Enhancing Question-Answering Tasks Performance in Telecommunications

Pedro Sousa, Cláudio Klautau Mello, Frank B. Morte, Luis F. Solis Navarro

Abstract

Question-answering tasks in the telecom domain remain relatively unexplored in the literature, primarily because of the field's rapid changes and evolving standards. This work presents a novel Retrieval-Augmented Generation (RAG) framework designed specifically for the telecommunications domain, focusing on datasets composed of 3GPP documents. The framework uses the Bisecting K-Means clustering technique to organize the embedding vectors by content, enabling more efficient information retrieval. At query time, the system pre-selects the subset of clusters most similar to the user's query, which improves the relevance of the retrieved information. To keep the computational cost of inference low, the framework was tested with Small Language Models, demonstrating improved performance, with an accuracy of 66.12% on a fine-tuned phi-2 model and 72.13% on a fine-tuned phi-3 model, along with reduced training time.
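
The cluster pre-selection idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the split criterion (largest sum of squared errors), the deterministic farthest-point initialization, the use of cosine similarity against cluster centroids, the function names, and the toy 2-D vectors standing in for real embeddings are all assumptions made for illustration.

```python
import math

def _dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def _mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def _sse(points):
    """Sum of squared distances of the points to their own mean."""
    m = _mean(points)
    return sum(_dist2(p, m) for p in points)

def _cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0

def _two_means(points, iters=20):
    """2-means with a deterministic farthest-point start (an assumption).
    Returns two lists of local indices, or None if no split was found."""
    c0 = max(points, key=lambda p: _dist2(p, _mean(points)))
    c1 = max(points, key=lambda p: _dist2(p, c0))
    centers, groups = [c0, c1], None
    for _ in range(iters):
        new_groups = ([], [])
        for i, p in enumerate(points):
            nearer = 0 if _dist2(p, centers[0]) <= _dist2(p, centers[1]) else 1
            new_groups[nearer].append(i)
        if new_groups == groups or not new_groups[0] or not new_groups[1]:
            break
        groups = new_groups
        centers = [_mean([points[i] for i in g]) for g in groups]
    return groups

def bisecting_kmeans(vectors, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break
        target = max(candidates, key=lambda c: _sse([vectors[i] for i in c]))
        clusters.remove(target)
        parts = _two_means([vectors[i] for i in target])
        if parts is None:  # degenerate case, e.g. identical points: split in half
            half = len(target) // 2
            parts = (list(range(half)), list(range(half, len(target))))
        clusters.extend([target[i] for i in part] for part in parts)
    return clusters

def retrieve(query, vectors, clusters, top_clusters=1, top_k=2):
    """Keep only the clusters whose centroids best match the query,
    then rank the chunks inside those clusters."""
    centroids = [_mean([vectors[i] for i in c]) for c in clusters]
    ranked = sorted(range(len(clusters)),
                    key=lambda j: _cosine(query, centroids[j]), reverse=True)
    pool = [i for j in ranked[:top_clusters] for i in clusters[j]]
    return sorted(pool, key=lambda i: _cosine(query, vectors[i]), reverse=True)[:top_k]

# Toy "embeddings": three loose groups of two chunks each (hypothetical data).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0],
              [0.2, 0.9], [-1.0, 0.1], [-0.9, -0.1]]
clusters = bisecting_kmeans(embeddings, k=3)
hits = retrieve([0.1, 1.0], embeddings, clusters, top_clusters=1, top_k=2)
```

In this sketch only the chunks in the selected clusters are compared against the query, so the per-query similarity search touches a fraction of the index. How many clusters to keep (`top_clusters`) is a tuning choice that the abstract does not specify.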

Authors (4)

Pedro Sousa
Cláudio Klautau Mello
Frank B. Morte
Luis F. Solis Navarro

Citation Format

Sousa, P., Mello, C.K., Morte, F.B., Navarro, L.F.S. (2025). Bisecting K-Means in RAG for Enhancing Question-Answering Tasks Performance in Telecommunications. https://arxiv.org/abs/2502.20188

Journal Information
Year of Publication: 2025
Language: en
Source Database: arXiv
Access: Open Access