arXiv Open Access 2019

Subjective Assessment of Text Complexity: A Dataset for German Language

Babak Naderi Salar Mohtaj Kaspar Ensikat Sebastian Möller

Abstract

This paper presents TextComplexityDE, a dataset of 1000 German sentences taken from 23 Wikipedia articles in 3 different article genres, intended for developing text-complexity prediction models and automatic text simplification for German. The dataset includes subjective assessments of different text-complexity aspects provided by learners of German at levels A and B. In addition, it contains manual simplifications of 250 of those sentences provided by native speakers, along with subjective assessments of the simplified sentences by participants from the target group. The subjective ratings were collected using both laboratory studies and a crowdsourcing approach.

Topics & Keywords

cs.CL

Authors (4)

Babak Naderi

Salar Mohtaj

Kaspar Ensikat

Sebastian Möller

Citation Format

Naderi, B., Mohtaj, S., Ensikat, K., & Möller, S. (2019). Subjective Assessment of Text Complexity: A Dataset for German Language. https://arxiv.org/abs/1904.07733

Journal Information

Publication Year: 2019
Language: en
Database Source: arXiv
Access: Open Access ✓