arXiv Open Access 2025

Towards Trustworthy Sentiment Analysis in Software Engineering: Dataset Characteristics and Tool Selection

Martin Obaidi, Marc Herrmann, Jil Klünder, Kurt Schneider

Abstract

Software development relies heavily on text-based communication, making sentiment analysis a valuable tool for understanding team dynamics and supporting trustworthy AI-driven analytics in requirements engineering. However, existing sentiment analysis tools often perform inconsistently across datasets from different platforms because communication style and content vary. In this study, we analyze linguistic and statistical features of 10 developer communication datasets from five platforms and evaluate the performance of 14 sentiment analysis tools. Based on these results, we propose a mapping approach and a questionnaire that recommend suitable sentiment analysis tools for new datasets, using their characteristic features as input. Our results show that dataset characteristics can be leveraged to improve tool selection, as platforms differ substantially in both linguistic and statistical properties. While transformer-based models such as SetFit and RoBERTa consistently achieve strong results, tool effectiveness remains context-dependent. Our approach supports researchers and practitioners in selecting trustworthy tools for sentiment analysis in software engineering, while highlighting the need for ongoing evaluation as communication contexts evolve.

Citation Format

Obaidi, M., Herrmann, M., Klünder, J., Schneider, K. (2025). Towards Trustworthy Sentiment Analysis in Software Engineering: Dataset Characteristics and Tool Selection. https://arxiv.org/abs/2507.02137

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access