arXiv Open Access 2026

When World Models Dream Wrong: Physical-Conditioned Adversarial Attacks against World Models

Zhixiang Guo, Siyuan Liang, Andras Balogh, Noah Lunberry, Rong-Cheng Tu, +2 others

Abstract

Generative world models (WMs) are increasingly used to synthesize controllable, sensor-conditioned driving videos, yet their reliance on physical priors exposes novel attack surfaces. In this paper, we present Physical-Conditioned World Model Attack (PhysCond-WMA), the first white-box world model attack that perturbs physical-condition channels, such as HDMap embeddings and 3D-box features, to induce semantic, logic, or decision-level distortion while preserving perceptual fidelity. PhysCond-WMA is optimized in two stages: (1) a quality-preserving guidance stage that constrains the reverse-diffusion loss below a calibrated threshold, and (2) a momentum-guided denoising stage that accumulates target-aligned gradients along the denoising trajectory for stable, temporally coherent semantic shifts. Extensive experimental results demonstrate that our approach remains effective while increasing FID by about 9% and FVD by about 3.9% on average. Under the targeted attack setting, the attack success rate (ASR) reaches 0.55. Downstream studies further show tangible risk: training on attacked videos decreases 3D detection performance by about 4% and worsens open-loop planning performance by about 20%. These findings reveal and quantify, for the first time, security vulnerabilities in generative world models, motivating more comprehensive security checkers.
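
The abstract names two ingredients: a quality constraint held below a threshold, and momentum-accumulated, target-aligned gradients applied to a condition channel. The sketch below illustrates only that general pattern on a toy problem and is not the paper's algorithm. Everything in it is an assumption made for illustration: the linear map `A` stands in for the generator, `||delta||^2` stands in for the reverse-diffusion loss, and the names `toy_condition_attack`, `targeted_loss`, `alpha`, `mu`, and `tau` are hypothetical.

```python
import numpy as np

def toy_condition_attack(cond, target, A, steps=50, alpha=0.05, mu=0.9, tau=0.25):
    """Toy illustration only, NOT the PhysCond-WMA implementation.

    cond   : condition vector (stand-in for an HDMap or 3D-box embedding)
    target : output the attacker wants from the linear toy "generator" A
    tau    : threshold on the quality proxy ||delta||^2 (assumed proxy)
    """
    delta = np.zeros_like(cond, dtype=float)
    momentum = np.zeros_like(delta)
    for _ in range(steps):
        # Gradient of the toy targeted loss ||A(cond + delta) - target||^2.
        grad = 2.0 * A.T @ (A @ (cond + delta) - target)
        # Accumulate L2-normalized gradients across steps (momentum term).
        momentum = mu * momentum + grad / (np.linalg.norm(grad) + 1e-12)
        delta = delta - alpha * momentum
        # Quality constraint: rescale delta so the proxy stays at or below tau.
        quality = float(delta @ delta)
        if quality > tau:
            delta *= np.sqrt(tau / quality)
    return delta

def targeted_loss(cond, delta, target, A):
    """Squared distance between the toy generator output and the target."""
    return float(np.sum((A @ (cond + delta) - target) ** 2))

# Usage on a 4-dimensional toy problem with an identity "generator".
A = np.eye(4)
cond = np.zeros(4)
target = np.ones(4)
delta = toy_condition_attack(cond, target, A)
print(targeted_loss(cond, np.zeros(4), target, A), targeted_loss(cond, delta, target, A))
```

In this toy setting the perturbation moves the output toward the target until the quality proxy reaches `tau`, after which rescaling keeps it on the constraint boundary. A real attack would replace both the linear map and the proxy with the diffusion model's own losses, which the abstract does not specify.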

Authors (7)

Zhixiang Guo

Siyuan Liang

Andras Balogh

Noah Lunberry

Rong-Cheng Tu

Mark Jelasity

Dacheng Tao

Citation Format

Guo, Z., Liang, S., Balogh, A., Lunberry, N., Tu, R., Jelasity, M. et al. (2026). When World Models Dream Wrong: Physical-Conditioned Adversarial Attacks against World Models. https://arxiv.org/abs/2602.18739

Journal Information

Year Published: 2026
Language: en
Source Database: arXiv
Access: Open Access