DOAJ Open Access 2026

EO-MADDPG: An Improved Reinforcement Learning Approach for Multi-UAV Pursuit–Evasion Games

Xiao Wang, Mengyu Wang, Xueqian Bai, Zhe Ma, Kewu Sun, +1 more

Abstract

To advance research in multi-agent reinforcement learning (MARL) for pursuit–evasion scenarios, this paper introduces a novel algorithm called Expert Knowledge and Opponent Modeling Multi-UAV Deep Deterministic Policy Gradient (EO-MADDPG). EO-MADDPG consists of two key components: the integration of expert knowledge with real-time sampled data, and the prediction of evader UAV actions. The expert knowledge includes a multi-UAV formation control algorithm and an encirclement strategy, which incorporates consensus algorithms and Apollonius circle guidance. Additionally, the network-training framework is optimized by integrating information about opponent actions under a fixed policy for improved prediction accuracy. The experiments focus on three vs. one and three vs. two scenarios, in which pursuer UAVs use EO-MADDPG and evader UAVs follow fixed policies with Gaussian perturbations. Experimental results show that EO-MADDPG achieves success rates of 99.9 ± 0.3% and 97.5 ± 1.4% (mean ± std over five seeds) in three vs. one and three vs. two pursuit–evasion simulations, respectively, outperforming the baseline MADDPG (72.7 ± 6.0% and 64.4 ± 34.4%). Ablation studies and cooperative landmark tasks further demonstrate improved training stability and interpretability.
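The abstract names Apollonius circle guidance but does not give its formulation. As a rough illustration only, the sketch below computes the textbook Apollonius circle for one pursuer and one evader in the plane: the set of points that both reach at the same time when each moves in a straight line at constant speed. The function name `apollonius_circle`, the 2D setting, and the requirement that the pursuer be faster (speed ratio below 1) are assumptions made for this example, not details taken from the paper.

```python
import math


def apollonius_circle(pursuer, evader, speed_ratio):
    """Return (center, radius) of the Apollonius circle in 2D.

    speed_ratio = evader_speed / pursuer_speed, assumed to lie in (0, 1).
    The circle is the set of points X with
    |X - evader| = speed_ratio * |X - pursuer|.
    (Hypothetical helper for illustration; not the paper's implementation.)
    """
    if not 0.0 < speed_ratio < 1.0:
        raise ValueError("speed_ratio must lie strictly between 0 and 1")
    k2 = speed_ratio ** 2
    px, py = pursuer
    ex, ey = evader
    # Center lies on the pursuer-evader line, beyond the evader.
    center = ((ex - k2 * px) / (1.0 - k2), (ey - k2 * py) / (1.0 - k2))
    # Radius grows with the separation and with the speed ratio.
    radius = speed_ratio * math.dist(pursuer, evader) / (1.0 - k2)
    return center, radius


center, radius = apollonius_circle(pursuer=(0.0, 0.0), evader=(3.0, 0.0), speed_ratio=0.5)
print(center, radius)
```

With the pursuer at the origin, the evader at (3, 0), and the evader moving at half the pursuer's speed, the circle is centered at (4, 0) with radius 2. Points inside the circle are those the evader reaches first, which is why encirclement strategies of this kind typically try to shrink or cover it.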

Authors (6)

Xiao Wang

Mengyu Wang

Xueqian Bai

Zhe Ma

Kewu Sun

Jiake Li

Citation Format

Wang, X., Wang, M., Bai, X., Ma, Z., Sun, K., Li, J. (2026). EO-MADDPG: An Improved Reinforcement Learning Approach for Multi-UAV Pursuit–Evasion Games. https://doi.org/10.3390/aerospace13030296

Journal Information
Publication year: 2026
Source database: DOAJ
DOI: 10.3390/aerospace13030296
Access: Open Access