arXiv Open Access 2025

GLiSE: A Prompt-Driven and ML-Powered Tool for Automated Grey Literature Extraction in Software Engineering

Houcine Abdelkader Cherief, Brahim Mahmoudi, Zacharie Chenail-Larcher, Naouel Moha, Quentin Stiévenart, Florent Avellaneda

Abstract

Grey literature is essential to software engineering research because it captures practices and decisions that rarely appear in academic venues. However, collecting and assessing it at scale remains difficult: its sources, formats, and APIs are heterogeneous, which impedes reproducible, large-scale synthesis. To address this issue, we present GLiSE, a prompt-driven tool that turns a research topic prompt into platform-specific queries, gathers results from common software-engineering web sources (GitHub, Stack Overflow) and Google Search, and uses embedding-based semantic classifiers to filter and rank results by relevance. GLiSE is designed for reproducibility: all settings are configuration-based, and every generated query is accessible. In this paper, we (i) present the GLiSE tool, (ii) provide a curated dataset of software engineering grey-literature search results classified by semantic relevance to their originating search intent, and (iii) conduct an empirical study on the usability of our tool.
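The filter-and-rank step described in the abstract can be illustrated with a minimal sketch. This is not GLiSE's implementation: the function names (`embed`, `cosine`, `filter_and_rank`), the threshold value, and the sample results are all hypothetical, and a toy bag-of-words vector stands in for the learned embeddings and classifiers the abstract mentions.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding: a bag-of-words count vector.
    (Assumption: a real system would use a learned embedding model.)"""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_and_rank(intent, results, threshold=0.2):
    """Keep results whose similarity to the search intent meets the
    (hypothetical) threshold, sorted from most to least similar."""
    intent_vec = embed(intent)
    scored = [(cosine(intent_vec, embed(r)), r) for r in results]
    kept = [(s, r) for s, r in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Hypothetical search results gathered for one research topic.
results = [
    "How to test microservices with contract testing",
    "Best pizza places in Montreal",
    "Microservices testing strategies and pitfalls",
]
ranked = filter_and_rank("microservices testing", results)
for score, title in ranked:
    print(f"{score:.2f}  {title}")
```

In this sketch the off-topic result shares no terms with the intent, scores zero, and is dropped, while the two on-topic results are ordered by similarity.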

Authors (6)

Houcine Abdelkader Cherief

Brahim Mahmoudi

Zacharie Chenail-Larcher

Naouel Moha

Quentin Stiévenart

Florent Avellaneda

Citation Format

Cherief, H.A., Mahmoudi, B., Chenail-Larcher, Z., Moha, N., Stiévenart, Q., Avellaneda, F. (2025). GLiSE: A Prompt-Driven and ML-Powered Tool for Automated Grey Literature Extraction in Software Engineering. https://arxiv.org/abs/2512.23066

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓