arXiv Open Access 2025

SARCH: Multimodal Search for Archaeological Archives

Nivedita Sinha, Bharati Khanijo, Sanskar Singh, Priyansh Mahant, Ashutosh Roy, +3 others

Abstract

In this paper, we describe a multi-modal search system designed to search old archaeological books and reports. This corpus is digitally available as scanned PDFs, but varies widely in the quality of scans. Our pipeline, designed for multi-modal archaeological documents, extracts and indexes text, images (classified into maps, photos, layouts, and others), and tables. We evaluated different retrieval strategies, including keyword-based search, embedding-based models, and a hybrid approach that selects optimal results from both modalities. We report and analyze our preliminary results and discuss future work in this exciting vertical.
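The abstract does not say how the hybrid approach combines keyword-based and embedding-based results. As one illustrative possibility only, the sketch below merges two ranked lists with reciprocal rank fusion (RRF), a generic fusion technique that is not claimed to be the authors' method. The function names, the toy scorers (term counting and character-trigram cosine similarity standing in for a real keyword index and a real embedding model), and the sample passages are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (assumption: real OCR text needs more cleanup)."""
    return text.lower().split()


def keyword_rank(query, docs):
    """Toy keyword search: rank doc ids by how many tokens match a query term."""
    query_terms = set(tokenize(query))
    scores = {
        doc_id: sum(1 for tok in tokenize(text) if tok in query_terms)
        for doc_id, text in docs.items()
    }
    hits = [d for d in scores if scores[d] > 0]
    return sorted(hits, key=lambda d: (-scores[d], d))


def trigram_vector(text):
    """Stand-in for an embedding: a sparse count vector of character trigrams."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_rank(query, docs):
    """Toy embedding search: rank doc ids by cosine similarity to the query."""
    query_vec = trigram_vector(query)
    scores = {d: cosine(query_vec, trigram_vector(t)) for d, t in docs.items()}
    hits = [d for d in scores if scores[d] > 0]
    return sorted(hits, key=lambda d: (-scores[d], d))


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each list contributes 1 / (k + rank) per document."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: (-fused[d], d))


def hybrid_search(query, docs):
    """Run both toy retrievers and fuse their rankings."""
    return reciprocal_rank_fusion(
        [keyword_rank(query, docs), vector_rank(query, docs)]
    )


if __name__ == "__main__":
    # Hypothetical passages, not taken from the paper's corpus.
    sample_docs = {
        "p1": "excavation report describing pottery fragments",
        "p2": "site map showing trench layout",
        "p3": "photograph of excavated pottery",
    }
    print(hybrid_search("pottery excavation", sample_docs))
```

Rank-based fusion is used here because it needs no score normalization across the two retrievers; a score-weighted combination or a per-query selection between the two result lists would be equally plausible readings of "hybrid".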

Authors (8)

Nivedita Sinha
Bharati Khanijo
Sanskar Singh
Priyansh Mahant
Ashutosh Roy
Saubhagya Singh Bhadouria
Arpan Jain
Maya Ramanath

Citation Format

Sinha, N., Khanijo, B., Singh, S., Mahant, P., Roy, A., Bhadouria, S.S. et al. (2025). SARCH: Multimodal Search for Archaeological Archives. https://arxiv.org/abs/2511.05667

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓