DOAJ Open Access 2020

The Limitations of Stylometry for Detecting Machine-Generated Fake News

Schuster, Tal; Schuster, Roei; Shah, Darsh J.; Barzilay, Regina

Abstract

Recent developments in neural language models (LMs) have raised concerns about their potential misuse for automatically spreading misinformation. In light of these concerns, several studies have proposed to detect machine-generated fake news by capturing their stylistic differences from human-written text. These approaches, broadly termed stylometry, have found success in source attribution and misinformation detection in human-written texts. However, in this work, we show that stylometry is limited against machine-generated misinformation. Whereas humans speak differently when trying to deceive, LMs generate stylistically consistent text, regardless of underlying motive. Thus, though stylometry can successfully prevent impersonation by identifying text provenance, it fails to distinguish legitimate LM applications from those that introduce false information. We create two benchmarks demonstrating the stylistic similarity between malicious and legitimate uses of LMs, utilized in auto-completion and editing-assistance settings.

Topics & Keywords

Computational linguistics. Natural language processing

Authors (4)

Schuster, Tal

Schuster, Roei

Shah, Darsh J.

Barzilay, Regina

Citation Format


Schuster, T., Schuster, R., Shah, D. J., & Barzilay, R. (2020). The Limitations of Stylometry for Detecting Machine-Generated Fake News. https://doi.org/10.1162/coli_a_00380

Quick Access

View at source: doi.org/10.1162/coli_a_00380

Journal Information

Publication Year: 2020
Source Database: DOAJ
DOI: 10.1162/coli_a_00380
Access: Open Access