arXiv Open Access 2025

A Preliminary Study of RAG for Taiwanese Historical Archives

Claire Lin, Bo-Han Feng, Xuanjun Chen, Te-Lun Yang, Hung-yi Lee, Jyh-Shing Roger Jang

Abstract

Retrieval-Augmented Generation (RAG) has emerged as a promising approach for knowledge-intensive tasks. However, few studies have examined RAG for Taiwanese Historical Archives. In this paper, we present an initial study of a RAG pipeline applied to two historical Traditional Chinese datasets, Fort Zeelandia and the Taiwan Provincial Council Gazette, along with their corresponding open-ended query sets. We systematically investigate the effects of query characteristics and metadata integration strategies on retrieval quality, answer generation, and the performance of the overall system. The results show that early-stage metadata integration enhances both retrieval and answer accuracy while also revealing persistent challenges for RAG systems, including hallucinations during generation and difficulties in handling temporal or multi-hop historical queries.
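The abstract does not say how metadata is integrated, so the following is only an illustrative sketch of one possible reading of "early-stage" integration: metadata fields are attached to each document's text before it is indexed, so that a query mentioning a year or source can match them at retrieval time. The toy documents, the field names `source` and `year`, and the bag-of-words cosine scorer are all assumptions made for this example; they are not the paper's data, retriever, or method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word/number tokens. Real Traditional Chinese text would
    # need a proper segmenter or an embedding model instead (assumption).
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def index_text(doc, with_metadata):
    # "Early-stage" integration as assumed here: prepend metadata to the
    # text that gets indexed, rather than filtering or reranking afterwards.
    if with_metadata:
        meta = " ".join(f"{key} {value}" for key, value in doc["meta"].items())
        return meta + " " + doc["text"]
    return doc["text"]


def retrieve(query, docs, with_metadata, k=1):
    # Score every document against the query and return the top-k ids.
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(index_text(d, with_metadata)))), d["id"])
        for d in docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical placeholder documents, invented for illustration only.
docs = [
    {"id": "a", "text": "the council discussed the irrigation budget",
     "meta": {"source": "gazette", "year": "1952"}},
    {"id": "b", "text": "the council discussed the irrigation budget again",
     "meta": {"source": "gazette", "year": "1960"}},
]
query = "irrigation budget 1960"

print(retrieve(query, docs, with_metadata=False))
print(retrieve(query, docs, with_metadata=True))
```

In this toy setup the year appears only in the metadata, so the temporally specific query can reach the intended document only when the metadata is part of the indexed text.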

Authors (6)

Claire Lin
Bo-Han Feng
Xuanjun Chen
Te-Lun Yang
Hung-yi Lee
Jyh-Shing Roger Jang

Citation Format

Lin, C., Feng, B., Chen, X., Yang, T., Lee, H., Jang, J.R. (2025). A Preliminary Study of RAG for Taiwanese Historical Archives. https://arxiv.org/abs/2511.07445

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access