arXiv Open Access 2023

Aspect-Driven Structuring of Historical Dutch Newspaper Archives

Hermann Kroll, Christin Katharina Kreutz, Mirjam Cuper, Bill Matthias Thang, Wolf-Tilo Balke

Abstract

Digital libraries oftentimes provide access to historical newspaper archives via keyword-based search. Historical figures and their roles are particularly interesting cognitive access points in historical research. Structuring and clustering news articles would allow more sophisticated access for users to explore such information. However, real-world limitations such as the lack of training data, licensing restrictions and non-English text with OCR errors make the composition of such a system difficult and cost-intensive in practice. In this work we tackle these issues with the showcase of the National Library of the Netherlands by introducing a role-based interface that structures news articles on historical persons. In-depth, component-wise evaluations and interviews with domain experts highlighted our prototype's effectiveness and appropriateness for a real-world digital library collection.

Authors (5)

Hermann Kroll

Christin Katharina Kreutz

Mirjam Cuper

Bill Matthias Thang

Wolf-Tilo Balke

Citation Format

Kroll, H., Kreutz, C.K., Cuper, M., Thang, B.M., Balke, W.-T. (2023). Aspect-Driven Structuring of Historical Dutch Newspaper Archives. https://arxiv.org/abs/2307.09203

Journal Information

Publication Year: 2023
Language: en
Source Database: arXiv
Access: Open Access