arXiv Open Access 2023

Cross-Domain Evaluation of POS Taggers: From Wall Street Journal to Fandom Wiki

Kia Kirstein Hansen, Rob van der Goot

Abstract

The Wall Street Journal (WSJ) section of the Penn Treebank has long been the de facto standard for evaluating POS taggers, and accuracies over 97% have been reported. However, less is known about out-of-domain tagger performance, especially with fine-grained label sets. Using data from Elder Scrolls Fandom, a wiki about the Elder Scrolls video game universe, we create a modest dataset for qualitatively evaluating the cross-domain performance of two POS taggers: the Stanford tagger (Toutanova et al. 2003) and Bilty (Plank et al. 2016), both trained on WSJ. Our analyses show that performance on tokens seen during training is almost as good as in-domain performance, but accuracy on unknown tokens decreases from 90.37% to 78.37% (Stanford) and from 87.84% to 80.41% (Bilty) across domains. Both taggers struggle with proper nouns and inconsistent capitalization.
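The split between tokens seen during training and unknown tokens can be illustrated with a small sketch. This is not the authors' evaluation code: the function name `split_accuracy`, the toy sentence, and the choice to treat a token as "known" when its exact, case-sensitive form appears in the training vocabulary are all assumptions made for illustration. Only the four unknown-token accuracies at the end are taken from the abstract.

```python
from typing import Dict, List, Set, Tuple


def split_accuracy(
    train_vocab: Set[str],
    gold: List[Tuple[str, str]],
    pred: List[str],
) -> Dict[str, float]:
    """Tagging accuracy computed separately for known and unknown tokens.

    Assumption: a token counts as "known" if its exact surface form
    (case-sensitive) occurs in the training vocabulary.
    """
    counts = {"known": [0, 0], "unknown": [0, 0]}  # [correct, total]
    for (token, gold_tag), pred_tag in zip(gold, pred):
        bucket = "known" if token in train_vocab else "unknown"
        counts[bucket][1] += 1
        if gold_tag == pred_tag:
            counts[bucket][0] += 1
    # Report NaN for a bucket that has no tokens at all.
    return {
        name: (correct / total if total else float("nan"))
        for name, (correct, total) in counts.items()
    }


# Hypothetical toy data: "Dragonborn" is absent from the training vocabulary.
train_vocab = {"The", "guard", "walks", "."}
gold = [("The", "DT"), ("Dragonborn", "NNP"), ("walks", "VBZ"), (".", ".")]
pred = ["DT", "NN", "VBZ", "."]
result = split_accuracy(train_vocab, gold, pred)

# Drop in unknown-token accuracy, in percentage points, from the abstract.
stanford_drop = round(90.37 - 78.37, 2)
bilty_drop = round(87.84 - 80.41, 2)
```

On the toy sentence, the one unknown token is a proper noun mistagged as a common noun, which is the kind of error the abstract highlights. With the reported figures, the unknown-token drop is 12.00 percentage points for Stanford and 7.43 for Bilty.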

Authors (2)

Kia Kirstein Hansen

Rob van der Goot

Citation Format

Hansen, K. K., & van der Goot, R. (2023). Cross-Domain Evaluation of POS Taggers: From Wall Street Journal to Fandom Wiki. https://arxiv.org/abs/2304.13989

Journal Information
Publication year: 2023
Language: en
Source database: arXiv
Access: Open Access