DOAJ Open Access 2025

BeliN: A novel corpus for Bengali religious news headline generation using contextual feature fusion

Md Osama, Ashim Dey, Kawsar Ahmed, Muhammad Ashad Kabir

Abstract

Automatic text summarization, particularly headline generation, remains a critical yet under-explored area for Bengali religious news. Existing approaches to headline generation typically rely solely on the article content, overlooking crucial contextual features such as sentiment, category, and aspect. This limitation significantly hinders their effectiveness and overall performance. This study addresses this limitation by introducing a novel corpus, BeliN (Bengali Religious News) – comprising religious news articles from prominent Bangladeshi online newspapers, and MultiGen – a contextual multi-input feature fusion headline generation approach. Leveraging transformer-based pre-trained language models such as BanglaT5, mBART, mT5, and mT0, MultiGen integrates additional contextual features – including category, aspect, and sentiment – with the news content. This fusion enables the model to capture critical contextual information often overlooked by traditional methods. Experimental results demonstrate the superiority of MultiGen over the baseline approach that uses only news content, achieving a BLEU score of 18.61 and ROUGE-L score of 24.19, compared to baseline approach scores of 16.08 and 23.08, respectively. These findings underscore the importance of incorporating contextual features in headline generation for low-resource languages. By bridging linguistic and cultural gaps, this research advances natural language processing for Bengali and other under-represented languages. To promote reproducibility and further exploration, the dataset and implementation code are publicly accessible at https://github.com/akabircs/BeliN.
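The abstract does not spell out how the contextual features are combined with the article text, so the following is only a minimal sketch of one common way to do multi-input fusion for a text-to-text model: prefixing the content with labelled feature fields. The names `NewsItem`, `fuse_inputs`, and `relative_gain`, the `" | "` separator, and the example field values are all assumptions made for illustration and are not taken from the BeliN code. The second helper simply works out the relative improvement implied by the reported scores.

```python
from dataclasses import dataclass


@dataclass
class NewsItem:
    """Hypothetical record: article content plus three contextual features."""
    content: str
    category: str
    aspect: str
    sentiment: str


def fuse_inputs(item: NewsItem, sep: str = " | ") -> str:
    """Join the contextual features and the content into one source string.

    This is an assumed fusion format, not the authors' documented one.
    """
    fields = [
        f"category: {item.category}",
        f"aspect: {item.aspect}",
        f"sentiment: {item.sentiment}",
        f"content: {item.content}",
    ]
    return sep.join(fields)


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Placeholder values, invented for illustration only.
example = NewsItem(
    content="Article text goes here.",
    category="religion",
    aspect="festival",
    sentiment="positive",
)
fused = fuse_inputs(example)

# Scores reported in the abstract: MultiGen vs. content-only baseline.
bleu_gain = relative_gain(18.61, 16.08)     # about 15.7 percent
rouge_l_gain = relative_gain(24.19, 23.08)  # about 4.8 percent
```

In this sketch the fused string would be what is passed to the tokenizer of a sequence-to-sequence model, with the headline as the target. In absolute terms the reported gains are 2.53 BLEU points and 1.11 ROUGE-L points.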

Citation Format

Osama, M., Dey, A., Ahmed, K., Kabir, M.A. (2025). BeliN: A novel corpus for Bengali religious news headline generation using contextual feature fusion. https://doi.org/10.1016/j.nlp.2025.100138

Journal Information
Publication year: 2025
Source database: DOAJ
DOI: 10.1016/j.nlp.2025.100138
Access: Open Access