arXiv Open Access 2025

On The Dangers of Poisoned LLMs In Security Automation

Patrick Karlsen Even Eilertsen

Lihat Sumber

Abstrak

This paper investigates some of the risks introduced by "LLM poisoning," the intentional or unintentional introduction of malicious or biased data during model training. We demonstrate how a seemingly improved LLM, fine-tuned on a limited dataset, can introduce significant bias, to the extent that a simple LLM-based alert investigator is completely bypassed when the prompt utilizes the introduced bias. Using fine-tuned Llama3.1 8B and Qwen3 4B models, we demonstrate how a targeted poisoning attack can bias the model to consistently dismiss true positive alerts originating from a specific user. Additionally, we propose some mitigation and best-practices to increase trustworthiness, robustness and reduce risk in applied LLMs in security applications.

Topik & Kata Kunci

cs.CR cs.AI

Penulis (2)

Patrick Karlsen

Even Eilertsen

Format Sitasi

APA MLA BibTeX

Karlsen, P., Eilertsen, E. (2025). On The Dangers of Poisoned LLMs In Security Automation. https://arxiv.org/abs/2511.02600

Akses Cepat

Lihat di Sumber

Informasi Jurnal

Tahun Terbit: 2025
Bahasa: en
Sumber Database: arXiv
Akses: Open Access ✓