Semantic Scholar Open Access 2019 40 citations

Named entity recognition goes to old regime France: geographic text analysis for early modern French corpora

Katherine McDonough Ludovic Moncla M. V. D. Camp

Abstract

Geographic text analysis (GTA) research in the digital humanities has focused on projects analyzing modern English-language corpora. These projects depend on temporally specific lexicons and gazetteers that enable place name identification and georesolution. Scholars working on the early modern period (1400–1800) lack temporally appropriate geoparsers and gazetteers and have been reliant on general purpose linked open data services like Geonames. These anachronistic resources introduce significant information retrieval and ethical challenges for early modernists. Using the geography entries of the canonical eighteenth-century Encyclopédie, we evaluate rule-based named entity recognition (NER) systems to pinpoint areas where they would benefit from adjustments for processing historical corpora. As we demonstrate, annotating nested and extended place information is one way to improve early modern GTA. Working with Enlightenment sources also motivates a critique of the landscape of digital geospatial data.

Authors (3)

Katherine McDonough

Ludovic Moncla

M. V. D. Camp

Citation Format

McDonough, K., Moncla, L., Camp, M.V.D. (2019). Named entity recognition goes to old regime France: geographic text analysis for early modern French corpora. https://doi.org/10.1080/13658816.2019.1620235

Journal Information
Publication Year: 2019
Language: en
Total Citations: 40
Source Database: Semantic Scholar
DOI: 10.1080/13658816.2019.1620235
Access: Open Access