Tuning Multilingual Transformers for Language-Specific Named Entity Recognition
Abstract
Our paper addresses the problem of multilingual named entity recognition for four Slavic languages: Russian, Bulgarian, Czech, and Polish. We solve this task with the BERT model, using a multilingual model pre-trained on one hundred languages as the base for transfer to these Slavic languages. Unsupervised pre-training of the BERT model on the four target languages allows us to significantly outperform baseline neural approaches and multilingual BERT. A further improvement is achieved by extending BERT with a word-level CRF layer. Our system was submitted to the BSNLP 2019 Shared Task on Multilingual Named Entity Recognition and demonstrated top performance in the multilingual setting for two of the competition metrics. We have open-sourced the NER models and the BERT model pre-trained on the four Slavic languages.
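The word-level CRF extension mentioned in the abstract can be pictured as a BERT encoder whose first-subword hidden states are projected to tag emissions and scored by a linear-chain CRF. Below is a minimal sketch under assumed tooling (the `transformers` and `pytorch-crf` packages); the class name `BertWordCRF` and the `word_starts`/`word_mask` inputs are illustrative and are not taken from the paper's released code.

```python
import torch
import torch.nn as nn
from torchcrf import CRF           # pytorch-crf package (assumed dependency)
from transformers import AutoModel


class BertWordCRF(nn.Module):
    """Illustrative BERT encoder with a word-level CRF tagging head."""

    def __init__(self, model_name: str, num_tags: int):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.proj = nn.Linear(self.encoder.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, word_starts, word_mask, tags=None):
        # Subword-level contextual states from BERT.
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Gather the hidden state of the first subword of every word, so the
        # CRF sees one emission per word rather than per subword.
        # word_starts: LongTensor (batch, max_words) of first-subword indices.
        idx = word_starts.unsqueeze(-1).expand(-1, -1, hidden.size(-1))
        word_hidden = hidden.gather(1, idx)    # (batch, max_words, hidden)
        emissions = self.proj(word_hidden)     # (batch, max_words, num_tags)
        # word_mask: bool tensor (batch, max_words), zeros only for padding.
        if tags is not None:
            # Negative log-likelihood of the gold word-level tag sequence.
            return -self.crf(emissions, tags, mask=word_mask)
        # Viterbi decoding returns one tag id per unpadded word.
        return self.crf.decode(emissions, mask=word_mask)
```

Pooling only the first subword of each word keeps the CRF transition matrix at word granularity, which matches the word-level tagging scheme of the shared task while still letting BERT operate on its native subword vocabulary.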
Authors (4)
Mikhail Arkhipov
M. Trofimova
Yuri Kuratov
A. Sorokin
Quick Access
- Publication Year: 2019
- Language: English (en)
- Total Citations: 102
- Source Database: Semantic Scholar
- DOI: 10.18653/v1/W19-3712
- Access: Open Access ✓