arXiv Open Access 2022

A Theory of Unsupervised Translation Motivated by Understanding Animal Communication

Shafi Goldwasser, David F. Gruber, Adam Tauman Kalai, Orr Paradise

Abstract

Neural networks are capable of translating between languages -- in some cases even between two languages where there is little or no access to parallel translations, in what is known as Unsupervised Machine Translation (UMT). Given this progress, it is intriguing to ask whether machine learning tools can ultimately enable understanding animal communication, particularly that of highly intelligent animals. We propose a theoretical framework for analyzing UMT when no parallel translations are available and when it cannot be assumed that the source and target corpora address related subject domains or possess similar linguistic structure. We exemplify this theory with two stylized models of language, for which our framework provides bounds on necessary sample complexity; the bounds are formally proven and experimentally verified on synthetic data. These bounds show that the error rates are inversely related to the language complexity and amount of common ground. This suggests that unsupervised translation of animal communication may be feasible if the communication system is sufficiently complex.

Topics & Keywords

cs.CL cs.LG

Authors (4)

Shafi Goldwasser

David F. Gruber

Adam Tauman Kalai

Orr Paradise

Citation Format

Goldwasser, S., Gruber, D. F., Kalai, A. T., & Paradise, O. (2022). A Theory of Unsupervised Translation Motivated by Understanding Animal Communication. https://arxiv.org/abs/2211.11081

Journal Information

Year Published: 2022
Language: en
Database Source: arXiv
Access: Open Access ✓