arXiv Open Access 2025

Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System

Haorui He, Yupeng Li, Bin Benjamin Zhu, Dacheng Wen, Reynold Cheng, Francis C. M. Lau

Abstract

State-of-the-art (SOTA) fact-checking systems combat misinformation by employing autonomous LLM-based agents to decompose complex claims into smaller sub-claims, verify each sub-claim individually, and aggregate the partial results to produce verdicts with justifications (explanations for the verdicts). The security of these systems is crucial, because compromised fact-checkers can amplify misinformation, yet it remains largely underexplored. To bridge this gap, this work introduces a novel threat model against such fact-checking systems and presents Fact2Fiction, the first poisoning attack framework targeting SOTA agentic fact-checking systems. Fact2Fiction employs LLMs to mimic the decomposition strategy and exploits system-generated justifications to craft tailored malicious evidence that compromises sub-claim verification. Extensive experiments demonstrate that Fact2Fiction achieves 8.9%–21.2% higher attack success rates than SOTA attacks across various poisoning budgets and exposes security weaknesses in existing fact-checking systems, highlighting the need for defensive countermeasures.
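
The decompose, verify, and aggregate flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `toy_decompose`, `make_toy_verifier`, `fact_check`, the dictionary-based evidence store, and the "every sub-claim must hold" aggregation rule are all hypothetical stand-ins for the LLM-based components, chosen only to show why the final verdict depends on the evidence behind each individual sub-claim.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class SubVerdict:
    sub_claim: str
    supported: bool
    rationale: str


def toy_decompose(claim: str) -> List[str]:
    # Hypothetical stand-in for LLM decomposition: split on " and ".
    return [part.strip() for part in claim.split(" and ") if part.strip()]


def make_toy_verifier(evidence: Dict[str, bool]) -> Callable[[str], SubVerdict]:
    # Hypothetical stand-in for retrieval plus LLM verification:
    # each sub-claim is looked up in a plain dictionary.
    def verify(sub_claim: str) -> SubVerdict:
        supported = evidence.get(sub_claim, False)
        rationale = "evidence agrees" if supported else "no supporting evidence"
        return SubVerdict(sub_claim, supported, rationale)

    return verify


def fact_check(
    claim: str,
    decompose: Callable[[str], List[str]],
    verify: Callable[[str], SubVerdict],
) -> Tuple[str, str]:
    sub_verdicts = [verify(sub_claim) for sub_claim in decompose(claim)]
    # Assumed aggregation rule: the claim holds only if every sub-claim holds.
    verdict = "supported" if all(v.supported for v in sub_verdicts) else "refuted"
    # The justification concatenates the per-sub-claim rationales.
    justification = "; ".join(f"{v.sub_claim}: {v.rationale}" for v in sub_verdicts)
    return verdict, justification


claim = "the bridge opened in 1990 and it is 2 km long"
clean = {"the bridge opened in 1990": True, "it is 2 km long": True}
verdict, justification = fact_check(claim, toy_decompose, make_toy_verifier(clean))
```

Under these assumptions, changing the evidence entry for a single sub-claim is enough to change the aggregated verdict, and the justification reveals which sub-claims the system reasoned about. That is the general property the abstract's threat model relies on.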

Authors (6)

Haorui He
Yupeng Li
Bin Benjamin Zhu
Dacheng Wen
Reynold Cheng
Francis C. M. Lau

Citation Format

He, H., Li, Y., Zhu, B.B., Wen, D., Cheng, R., Lau, F.C.M. (2025). Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System. https://arxiv.org/abs/2508.06059

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access