arXiv Open Access 2025

Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System

Haorui He, Yupeng Li, Bin Benjamin Zhu, Dacheng Wen, Reynold Cheng, Francis C. M. Lau

Abstract

State-of-the-art (SOTA) fact-checking systems combat misinformation by employing autonomous LLM-based agents to decompose complex claims into smaller sub-claims, verify each sub-claim individually, and aggregate the partial results to produce verdicts with justifications (explanations for the verdicts). The security of these systems is crucial, because compromised fact-checkers can amplify misinformation, yet it remains largely underexplored. To bridge this gap, this work introduces a novel threat model against such fact-checking systems and presents Fact2Fiction, the first poisoning attack framework targeting SOTA agentic fact-checking systems. Fact2Fiction employs LLMs to mimic the decomposition strategy and exploit system-generated justifications to craft tailored malicious evidence that compromises sub-claim verification. Extensive experiments demonstrate that Fact2Fiction achieves 8.9% to 21.2% higher attack success rates than SOTA attacks across various poisoning budgets and exposes security weaknesses in existing fact-checking systems, highlighting the need for defensive countermeasures.
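The decompose, verify, and aggregate flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `toy_decompose`, `make_toy_verifier`, and `fact_check` are hypothetical, string splitting and a dictionary lookup stand in for the LLM agents and evidence retrieval, and the rule that a claim is supported only when every sub-claim is supported is an assumed aggregation policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class SubVerdict:
    sub_claim: str
    supported: bool
    justification: str


def toy_decompose(claim: str) -> List[str]:
    # Hypothetical stand-in for the LLM decomposition step: split on " and ".
    return [part.strip() for part in claim.split(" and ")]


def make_toy_verifier(evidence: Dict[str, bool]) -> Callable[[str], SubVerdict]:
    # Hypothetical stand-in for retrieval plus LLM verification: look each
    # sub-claim up in a small evidence store; unknown sub-claims are unsupported.
    def verify(sub_claim: str) -> SubVerdict:
        supported = evidence.get(sub_claim, False)
        status = "supports" if supported else "does not support"
        return SubVerdict(sub_claim, supported, f"Evidence {status}: '{sub_claim}'.")

    return verify


def fact_check(
    claim: str,
    decompose: Callable[[str], List[str]],
    verify: Callable[[str], SubVerdict],
) -> Tuple[str, str]:
    sub_verdicts = [verify(s) for s in decompose(claim)]
    # Assumed aggregation rule: the claim holds only if every sub-claim holds.
    verdict = "supported" if all(v.supported for v in sub_verdicts) else "refuted"
    justification = " ".join(v.justification for v in sub_verdicts)
    return verdict, justification


claim = "the bridge opened in 1990 and the bridge is 2 km long"
clean = {"the bridge opened in 1990": True, "the bridge is 2 km long": True}
# Same store with the evidence for a single sub-claim altered.
tampered = {**clean, "the bridge is 2 km long": False}

clean_verdict, clean_justification = fact_check(claim, toy_decompose, make_toy_verifier(clean))
tampered_verdict, _ = fact_check(claim, toy_decompose, make_toy_verifier(tampered))
print(clean_verdict, tampered_verdict)
```

Under these toy assumptions, changing the evidence behind one sub-claim is enough to change the aggregated verdict, which is why the abstract singles out sub-claim verification as the point an attacker targets.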

Topics & Keywords

cs.CR cs.CL

Authors (6)

Haorui He

Yupeng Li

Bin Benjamin Zhu

Dacheng Wen

Reynold Cheng

Francis C. M. Lau

Citation Format

He, H., Li, Y., Zhu, B. B., Wen, D., Cheng, R., & Lau, F. C. M. (2025). Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System. https://arxiv.org/abs/2508.06059

Journal Information

Publication Year: 2025
Language: en
Database Source: arXiv
Access: Open Access