
What Does BERT Learn about the Structure of Language?

Ganesh Jawahar Benoît Sagot Djamé Seddah

Abstract

BERT is a recent language representation model that has surprisingly performed well on diverse language understanding benchmarks. This result suggests that BERT networks capture structural information about language. In this work, we provide novel support for this claim by performing a series of experiments to unpack the elements of English language structure learned by BERT. Our findings are fourfold. BERT's phrasal representation captures phrase-level information in the lower layers. The intermediate layers of BERT compose a rich hierarchy of linguistic information, starting with surface features at the bottom, syntactic features in the middle, and semantic features at the top. BERT requires deeper layers to handle long-distance dependencies, such as tracking subject-verb agreement. Finally, the compositional scheme underlying BERT mimics classical, tree-like structures.
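Layer-wise findings of this kind are typically obtained by feeding sentences through BERT and examining the activations of each layer separately. Below is a minimal sketch (not code from the paper) of how to extract per-layer representations using the HuggingFace transformers library; the model name, example sentence, and print loop are illustrative assumptions, not part of the original study.

```python
# Minimal sketch: extract per-layer hidden states from BERT, the kind of
# input used by layer-wise probing studies. Assumes the `transformers`
# and `torch` packages are installed; the example sentence is hypothetical.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

# A subject-verb agreement example ("keys ... are") of the sort used
# when probing long-distance dependencies.
sentence = "The keys to the cabinet are on the table."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of 13 tensors for bert-base: the embedding
# output plus one tensor per layer, each of shape (batch, seq_len, 768).
# A probing classifier would be trained on each of these separately.
for layer_idx, layer in enumerate(outputs.hidden_states):
    print(f"layer {layer_idx}: {tuple(layer.shape)}")
```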


Authors (3)

Ganesh Jawahar
Benoît Sagot
Djamé Seddah

Citation Format

Jawahar, G., Sagot, B., & Seddah, D. (2019). What Does BERT Learn about the Structure of Language? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1356

Quick Access

View at source: doi.org/10.18653/v1/P19-1356

Journal Information

Publication Year: 2019
Language: en
Total Citations: 1,498
Source Database: Semantic Scholar
DOI: 10.18653/v1/P19-1356
Access: Open Access ✓