Semantic Scholar · Open Access · 2014 · 256 citations

A massively parallel corpus: the Bible in 100 languages

Christos Christodoulopoulos, Mark Steedman

Abstract

We describe the creation of a massively parallel corpus based on 100 translations of the Bible. We discuss some of the difficulties in acquiring and processing the raw material as well as the potential of the Bible as a corpus for natural language processing. Finally we present a statistical analysis of the corpora collected and a detailed comparison between the English translation and other English corpora.

Authors (2)

Christos Christodoulopoulos

Mark Steedman

Citation Format

Christodoulopoulos, C., Steedman, M. (2014). A massively parallel corpus: the Bible in 100 languages. https://doi.org/10.1007/s10579-014-9287-y

Journal Information

Publication Year: 2014
Language: en
Total Citations: 256
Source Database: Semantic Scholar
DOI: 10.1007/s10579-014-9287-y
Access: Open Access