arXiv
Open Access
2023
Step by Step Loss Goes Very Far: Multi-Step Quantization for Adversarial Text Attacks
Piotr Gaiński
Klaudia Bałazy
Abstrak
We propose a novel gradient-based attack against transformer-based language models that searches for an adversarial example in a continuous space of token probabilities. Our algorithm mitigates the gap between adversarial loss for continuous and discrete text representations by performing multi-step quantization in a quantization-compensation loop. Experiments show that our method significantly outperforms other approaches on various natural language processing (NLP) tasks.
Topik & Kata Kunci
Penulis (2)
P
Piotr Gaiński
K
Klaudia Bałazy
Akses Cepat
Informasi Jurnal
- Tahun Terbit
- 2023
- Bahasa
- en
- Sumber Database
- arXiv
- Akses
- Open Access ✓