Trace-LogVector-Based Relational Retrieval for Conversational System Log Analysis
Abstract
System logs generated in IoT-based and sensor-driven cloud environments encode execution traces and complex relationships among services, functions, and data stores. In many IoT deployments, telemetry is pre-processed at the edge and then integrated into backend services (e.g., application servers and databases) for analytics and operations. During this integration, service executions record relational dependencies (e.g., function-to-data-store interactions) as operational logs (or aggregated statistics), which constitute key evidence for operating sensor-driven services. However, most existing retrieval-augmented generation (RAG) approaches remain document-centric, representing logs as flat textual chunks that fail to preserve execution flow and entity relationships, both of which are critical for diagnosing complex service execution pipelines in sensor-driven cloud backends. In this study, we propose Trace-LogVector (TLV), a relational log representation that transforms system logs into trace-level retrieval units while explicitly preserving execution order and entity interactions. TLV is constructed according to the Chunk as Relational Data (CARD) design principle, which represents execution flows using entity-centric multi-chunk structures rather than single aggregated text chunks. We evaluate TLV using publicly reproducible backend execution logs as a representative backend model and discuss the generality and limitations of this choice. To evaluate the impact of relational log representation, we conduct controlled experiments comparing single-chunk and CARD-based multi-chunk TLV under identical embedding and retrieval settings. Retrieval performance is quantitatively assessed using Hit@5 and Mean Reciprocal Rank at 5 (MRR@5). Experimental results show that the proposed multi-chunk TLV achieves a Hit@5 of 1.000 and an MRR@5 of 0.900, consistently outperforming the single-chunk baseline across all evaluation queries.
These findings demonstrate that preserving execution contexts and entity relationships as relational retrieval units is a key factor in improving RAG-based system log analysis for monitoring and diagnosing large-scale sensor networks and cloud systems.
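For readers unfamiliar with the two retrieval metrics named in the abstract, the following is a minimal sketch of how Hit@k and MRR@k are commonly computed. It is not the authors' evaluation code: the function names, the trace identifiers, and the toy queries are hypothetical, and it assumes each query comes with a ranked list of retrieved unit IDs and a set of relevant IDs.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def hit_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 5) -> float:
    """Return 1.0 if any relevant item appears in the top-k results, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def reciprocal_rank_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 5) -> float:
    """Return 1/rank of the first relevant item within the top-k, or 0.0 if none."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: List[Tuple[Sequence[Hashable], Set[Hashable]]], k: int = 5) -> Tuple[float, float]:
    """Average Hit@k and MRR@k over (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        raise ValueError("at least one query is required")
    n = len(runs)
    hit = sum(hit_at_k(ranked, rel, k) for ranked, rel in runs) / n
    mrr = sum(reciprocal_rank_at_k(ranked, rel, k) for ranked, rel in runs) / n
    return hit, mrr


# Hypothetical toy data: three queries with made-up trace IDs.
toy_runs = [
    (["t1", "t2", "t3"], {"t1"}),  # relevant trace at rank 1
    (["t4", "t5", "t6"], {"t5"}),  # relevant trace at rank 2
    (["t7", "t8"], {"t9"}),        # relevant trace not retrieved
]
hit5, mrr5 = evaluate(toy_runs, k=5)
print(f"Hit@5={hit5:.3f} MRR@5={mrr5:.3f}")
```

Under these definitions, a Hit@5 of 1.000 together with an MRR@5 of 0.900 means that a relevant unit appeared in the top five for every query, but not always at rank 1.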
Authors (2)
Sun-Chul Park
Young-Han Kim
Quick Access
- Year Published: 2026
- Source Database: DOAJ
- DOI: 10.3390/s26061806
- Access: Open Access